THREE TOPICS IN ADDITIVE PRIME NUMBER THEORY 



BEN GREEN 



Abstract. We discuss, in varying degrees of detail, three contemporary themes in
prime number theory. Topic 1: the work of Goldston, Pintz and Yildirim on short
gaps between primes. Topic 2: the work of Mauduit and Rivat, establishing that 50%
of the primes have odd digit sum in base 2. Topic 3: work of Tao and the author on
linear equations in primes.



Introduction. These notes are to accompany two lectures I am scheduled to give at 
the Current Developments in Mathematics conference at Harvard in November 2007. 
The title of those lectures is 'A good new millennium for primes', but I have chosen 
a rather drier title for these notes for two reasons. Firstly, the title of the lectures 
was unashamedly stolen (albeit with permission) from Andrew Granville's entertaining 
article [16] of the same name. Secondly, and more seriously, I do not wish to claim that 
the topics chosen here represent a complete survey of developments in prime number 
theory since 2000 or even a selection of the most important ones. Indeed there are 
certainly omissions, such as the lack of any discussion of the polynomial-time primality 
test [2], the failure to even mention the recent work on primes in orbits by Bourgain, 
Gamburd and Sarnak, and many others. 

I propose to discuss the following three topics, in greatly varying degrees of depth. 
Suggestions for further reading will be provided. The three sections may be read inde- 
pendently although there are links between them. 

1. Gaps between primes. Let $p_n$ be the $n$th prime number, thus $p_1 = 2$, $p_2 = 3$, and so
on. The prime number theorem, conjectured by Gauss and proven by Hadamard and
de la Vallée Poussin over 110 years ago, tells us that $p_n$ is asymptotic to $n \log n$, or in
other words that

\[ \lim_{n \to \infty} \frac{p_n}{n \log n} = 1. \]

This implies that the gap between the $n$th and $(n+1)$st primes, $p_{n+1} - p_n$, is about
$\log n$ on average. About 2 years ago Goldston, Pintz and Yildirim proved the following
remarkable result: for any $\epsilon > 0$, there are infinitely many $n$ such that
$p_{n+1} - p_n < \epsilon \log n$. That is, infinitely often there are consecutive primes whose
spacing is much closer than the average.

2. Digits of primes. Written in binary, the first few primes are 



10, 11, 101, 111, 1011, 1101, 10001, 10011, 10111, 

There is no obvious pattern$^{1}$. Indeed, why would there be, since the definition of 'prime'
has nothing to do with digital expansions? Proving such a statement, or even formulating
it correctly, is an entirely different matter. A couple of years ago, however, Mauduit
and Rivat did manage to prove that the digit sum is odd 50% of the time (and hence
even 50% of the time). They also obtained results in other bases.
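
As a small illustration of the statement about digit sums, the following sketch lists the
binary expansions of the first few primes and counts, among the primes below a cutoff, how
many have odd binary digit sum. The helper names (`primes_up_to`, `binary_digit_sum`) and the
cutoff $2^{16}$ are choices made for this illustration only; a finite count of this kind
proves nothing about the asymptotic statement.

```python
def primes_up_to(limit):
    """Return all primes <= limit using the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # Cross out the multiples of p starting from p*p.
            sieve[p * p :: p] = bytes(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]


def binary_digit_sum(n):
    """Sum of the binary digits of n."""
    return bin(n).count("1")


# Binary expansions of the first nine primes, as listed in the text.
first_nine = [format(p, "b") for p in primes_up_to(23)]
print(", ".join(first_nine))

# Proportion of primes below 2^16 whose binary digit sum is odd.
primes = primes_up_to(2 ** 16)
odd_count = sum(1 for p in primes if binary_digit_sum(p) % 2 == 1)
print(odd_count, len(primes), odd_count / len(primes))
```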

3. Patterns of primes. Additive questions concerning primes have a long history. It 
has been known for over 70 years that there are infinitely many 3-term arithmetic 
progressions of primes such as 3, 5, 7 and 5, 11, 17, and that every large odd number is 
the sum of three primes. Recently, in joint work with Tao, we have been able to study 
more complicated patterns of primes. In this section we provide a guide to this recent 
joint work. 

Throughout these notes we will write

\[ \mathbb{E}_{x \in X} f(x) := \frac{1}{|X|} \sum_{x \in X} f(x), \]

where $X$ is any finite set and $f : X \to \mathbb{C}$ is a function.



1. Gaps between primes 



These notes were originally prepared for a series of lectures I gave at the Norwegian 
Mathematical Society's Ski og mathematikk, which took place at Rondablikk in January 
2006. It is a pleasure to thank Christian Skau for inviting me to that event. The 
argument of Goldston, Pintz and Yildirim was first described to me by K. Soundararajan 
at the Highbury Vaults in Bristol. It is a pleasure to thank him, and to refer the 
interested reader to his lectures on the subject [41], which are superior to these in every
respect. 



1.1. The result. In 2005 Goldston, Pintz and Yildirim created a sensation by
announcing a proof that

\[ \liminf_{n \to \infty} \frac{p_{n+1} - p_n}{\log n} = 0, \]

where $p_n$ denotes the $n$th prime number. According to the prime number theorem we
have

\[ p_n \sim n \log n, \]

and therefore

\[ \frac{p_{n+1} - p_n}{\log n} \]

has average value 1. The Goldston, Pintz and Yildirim result thus states that the
distance between consecutive primes can be $\epsilon$ of the average spacing, for any
$\epsilon > 0$, and is thus certainly most spectacular.
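
To get a feeling for the normalisation, here is a minimal numerical sketch. It computes the
normalised gaps for the primes below $10^5$ and reports their mean and minimum. Two
assumptions are made for convenience: the gap is divided by $\log p_n$ rather than $\log n$
(the two are asymptotically equivalent), and the cutoff $10^5$ is arbitrary. A finite
computation of this kind says nothing about the $\liminf$ itself.

```python
import math


def primes_up_to(limit):
    """Return all primes <= limit using the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytes(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]


primes = primes_up_to(100_000)

# Normalised gaps (p_{n+1} - p_n) / log p_n for consecutive primes.
ratios = [(q - p) / math.log(p) for p, q in zip(primes, primes[1:])]

mean_ratio = sum(ratios) / len(ratios)
min_ratio = min(ratios)
print(f"mean normalised gap: {mean_ratio:.4f}")
print(f"smallest normalised gap: {min_ratio:.4f}")
```

The mean is close to 1, in line with the prime number theorem, while the minimum is attained
at a pair of twin primes.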



$^{1}$ Except, of course, that the last digit of every prime except the first is always 1.
Previous efforts at locating small gaps between primes focussed on proving successively
smaller upper bounds for $C := \liminf_{n \to \infty} \frac{p_{n+1} - p_n}{\log n}$. The
following table describing the history of these improvements does not make the
Goldston-Pintz-Yildirim result look any less striking:

Author(s)                    Year    Bound on C      Hypothesis
Hardy-Littlewood [27]        1926    2/3             on GRH
Rankin [39]                  1940    3/5             on GRH
Erdős [H]                    1940    1 - c           unconditionally
Ricci [H]                    1954    15/16
Bombieri-Davenport [1]       1965    0.4665...
Pil'tai [38]                 1972    0.4571...
Uchiyama [17]                1975    0.4542...
Huxley [29], [30], [31]      1984    0.4393...
Maier [35]                   1989    0.2484...
Goldston-Pintz-Yildirim      2005    0


For the detailed proof of this result we refer the reader to the authors' paper [13], as
well as to their expository account [12] and to their short article with Motohashi [14].
Our aim here is to give a very rough outline of the proof. One distinctive feature of 
the argument is that it 'only just' works, in a way that seems rather miraculous. We 
will endeavour to give some sense of this. We begin with two sections of background 
material. 



1.2. The Elliott-Halberstam Conjecture and level of distribution. Let
$q$ be a positive integer and suppose that $a$ is prime to $q$. We write

\[ \psi(N; a, q) := \mathbb{E}_{n \leq N,\, n \equiv a \pmod{q}} \Lambda(n), \]

where $\Lambda$ is the von Mangoldt function (the average is taken over all $n \leq N$, with
the congruence condition acting as a cutoff, so that $\psi$ is normalised by $N$). For
constant $q$ (and in fact for $q$ growing slowly with $N$, say $q \leq (\log N)^A$ for some
fixed $A$) the prime number theorem in arithmetic progressions tells us that

\[ \psi(N; a, q) \sim 1/\phi(q). \]

Conditional upon the GRH, we may assert the same result up to about $q \sim N^{1/2}$. The
remarkable theorem of Bombieri-Vinogradov (a proof of which the reader will find in
many texts on analytic number theory, such as [32]) states that something like this is
true unconditionally, provided one is prepared to average over $q$. A weak version of the
theorem is that

\[ \sum_{q \leq Q} \max_{(a,q)=1} \Bigl| \psi(N; a, q) - \frac{1}{\phi(q)} \Bigr| \ll_{A,\epsilon} \frac{1}{(\log N)^A} \tag{1.1} \]

for any fixed $A$ and for any $Q \leq N^{1/2 - \epsilon}$.
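
The prime number theorem in arithmetic progressions quoted above is easy to probe
numerically. The sketch below computes the normalised quantity
$\frac{1}{N}\sum_{n \leq N,\, n \equiv a \pmod q} \Lambda(n)$ for $q = 7$ and compares it with
$1/\phi(q)$. The function names, the modulus and the cutoff $N = 10^5$ are choices made for
this illustration; the normalisation by $N$ is the one assumed in the discussion above.

```python
import math


def von_mangoldt_table(limit):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_composite = bytearray(limit + 1)
    for p in range(2, limit + 1):
        if not is_composite[p]:
            for m in range(p * p, limit + 1, p):
                is_composite[m] = 1
            # Lambda(p^j) = log p for every prime power p^j <= limit.
            pk = p
            while pk <= limit:
                lam[pk] = math.log(p)
                pk *= p
    return lam


def euler_phi(q):
    """Euler's totient function, by direct counting."""
    return sum(1 for a in range(1, q + 1) if math.gcd(a, q) == 1)


def psi(lam, N, a, q):
    """Normalised sum of Lambda(n) over n <= N with n = a (mod q), for 1 <= a <= q."""
    return sum(lam[n] for n in range(a, N + 1, q)) / N


N, q = 100_000, 7
lam = von_mangoldt_table(N)
values = {a: psi(lam, N, a, q) for a in range(1, q) if math.gcd(a, q) == 1}
for a, value in values.items():
    print(a, round(value, 4), "vs", round(1 / euler_phi(q), 4))
```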
By using sieve theory one may show that $\psi(N; a, q) \ll 1/\phi(q)$, and so the LHS of (1.1)
is trivially bounded by

\[ \sum_{q \leq Q} \frac{1}{\phi(q)} \ll \log N. \]

The Bombieri-Vinogradov theorem permits us to save an arbitrary power of a logarithm
over this trivial bound.



Even conjectures on $L$-functions (such as the GRH) appear to tell us nothing about
the expression (1.1) when $Q \geq N^{1/2}$. Nonetheless, one may make conjectures. If
$\theta \in [1/2, 1)$ is a parameter then we say that the primes have level of distribution
$\theta$, or that the Elliott-Halberstam conjecture EH($\theta$) holds, if we have the bound

\[ \sum_{q \leq Q} \max_{(a,q)=1} \Bigl| \psi(N; a, q) - \frac{1}{\phi(q)} \Bigr| \ll_{A} \frac{1}{(\log N)^A} \tag{1.2} \]

for any $Q \leq N^{\theta}$.



The full Elliott-Halberstam conjecture [7] is that EH($\theta$) holds for all $\theta < 1$.
Assuming any Elliott-Halberstam conjecture EH($\theta$) with $\theta > 1/2$, Goldston, Pintz
and Yildirim can prove the remarkable result that gaps between consecutive primes are
infinitely often less than some absolute constant $C(\theta)$. Assuming EH(0.95971), they
prove that

\[ \liminf_{n \to \infty} (p_{n+1} - p_n) \leq 16 \]

(actually they prove a slightly weaker result; the value 0.95971 comes from unpublished
computations of J. Brian Conrey).

It should be stressed however that it is not expected that any conjecture EH($\theta$) for
$\theta > 1/2$ will be established in the near future. There are results of Bombieri,
Friedlander and Iwaniec which go a little beyond the Bombieri-Vinogradov theorem in
something resembling the required manner, although experts seem to be of the opinion that
these results will not help to improve the bounds on gaps between primes (cf. [U §16]).



1.3. Selberg's weights. This is the second section of background material. 

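In the 1940s Selberg introduced a wonderfully simple, yet powerful, idea to analytic
number theory. Write $1_P$ for the characteristic function of the primes. Then if $R$ is any
parameter and if $(\lambda_d)_{d \leq R}$ is any real sequence with $\lambda_1 = 1$, we have
the pointwise inequality

\[ 1_P(n) \leq \Bigl( \sum_{d | n,\, d \leq R} \lambda_d \Bigr)^2 \]

provided that $n > R$ (the proof is obvious: if $n > R$ is prime then the only divisor
$d \leq R$ of $n$ is $d = 1$, so the right-hand side is $\lambda_1^2 = 1$, and otherwise the
right-hand side is non-negative).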
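
The inequality is insensitive to the choice of weights, which the following sketch checks by
brute force. The weights are drawn at random (apart from $\lambda_1 = 1$); the names
`selberg_majorant` and `is_prime`, and the parameters $R = 20$, $N = 2000$, are choices made
for this illustration only.

```python
import random


def is_prime(n):
    """Primality test by trial division."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def selberg_majorant(n, lam):
    """Square of the sum of lam[d] over divisors d of n with d <= R = len(lam) - 1."""
    R = len(lam) - 1
    return sum(lam[d] for d in range(1, R + 1) if n % d == 0) ** 2


random.seed(0)
R, N = 20, 2000
# lam[0] is unused; lam[1] = 1 is the only constraint on the weights.
lam = [0.0, 1.0] + [random.uniform(-1.0, 1.0) for _ in range(2, R + 1)]

# Collect every n in (R, N] at which the majorant property fails.
violations = [
    n for n in range(R + 1, N + 1)
    if (1 if is_prime(n) else 0) > selberg_majorant(n, lam)
]
print("violations:", violations)
```

For primes $n > R$ the majorant equals exactly 1, whatever the remaining weights are; the
freedom in the $\lambda_d$ is what one later optimises over.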

This provides an enormous family of majorants for the sequence of primes. In a typical
application we will be interested in something like the set of primes $p$ less than some
cutoff $N$, and then $R$ will be some power $N^{\gamma}$, $\gamma < 1$. In this situation
Selberg's weights majorise the primes between $N^{\gamma}$ and $N$, that is to say almost all
of the primes less than $N$.

What weights should one choose? This depends on the application, but a very basic
application is to the estimation of $\frac{1}{y}(\pi(x + y) - \pi(x))$, the density of primes
in the interval $(x, x + y]$ (the Brun-Titchmarsh problem). In discussing this problem we will
also see why it is advantageous to construct a majorant for the primes, rather than work
with $1_P$ itself.

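For any choice of weights $\lambda_d$, then, we have

\[
\begin{aligned}
\frac{1}{y}(\pi(x + y) - \pi(x))
&\leq \mathbb{E}_{x+1 \leq n \leq x+y} \Bigl( \sum_{d | n,\, d \leq R} \lambda_d \Bigr)^2 \\
&= \sum_{d \leq R} \sum_{d' \leq R} \frac{\lambda_d \lambda_{d'}}{[d,d']}
 + O\Bigl( \frac{1}{y} \sum_{d \leq R} \sum_{d' \leq R} |\lambda_d| |\lambda_{d'}| \Bigr).
\end{aligned}
\tag{1.3}
\]

Let us imagine that the weights $\lambda_d$ are chosen to be $\ll y^{\epsilon}$ in absolute
value (this is always the case in practice). Then the second term here is
$O(R^2 y^{-1+2\epsilon})$. If $R \leq y^{1/2 - 2\epsilon}$ then this is $O(y^{-2\epsilon})$
and may be thought of as an error term. This is why it is advantageous (indeed essential)
to work with a majorant taken over a truncated range of divisors, and not with $1_P$ itself.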

The first term in (1.3),

\[ \sum_{d \leq R} \sum_{d' \leq R} \frac{\lambda_d \lambda_{d'}}{[d,d']}, \]

is a quadratic form. It may be explicitly minimised subject to the condition $\lambda_1 = 1$,
giving optimal weights $\lambda_d^{\mathrm{SEL}}$ which are independent of $x$ and $y$, and
the resultant expression may then be evaluated asymptotically. In this way one obtains the
well-known bound

\[ \frac{1}{y}(\pi(x + y) - \pi(x)) \leq (2 + \epsilon) \frac{1}{\log y}, \]

valid for $y > y_0(\epsilon)$.

What is the optimal choice of weights $\lambda_d^{\mathrm{SEL}}$ for the Brun-Titchmarsh
problem? The precise form will not concern us here (see, for example, [32]). However, it may
be shown that

\[ \lambda_d^{\mathrm{SEL}} \approx \mu(d) \frac{\log(R/d)}{\log R}. \]

(For a detailed discussion, see the appendix to [20] and the references to work of Ramaré
therein.)

We write

\[ \lambda_d^{\mathrm{GY}} := \mu(d) \frac{\log(R/d)}{\log R}. \]

These weights are very natural for two reasons: their simplicity of form, and the fact
that they approximate the optimal weights for the Brun-Titchmarsh problem. There is
a third reason for considering them, which comes upon recalling the formula

\[ \Lambda(n) = \sum_{d | n} \mu(d) \log(n/d). \]

We see, then, that

\[ \Lambda_R(n) := \sum_{d | n,\, d \leq R} \mu(d) \log(R/d) \]

is a kind of divisor-truncated version of $\Lambda$.

We have arrived at the conclusion that the function

\[ \frac{1}{(\log R)^2} \Lambda_R(n)^2 = \Bigl( \sum_{d | n,\, d \leq R} \lambda_d^{\mathrm{GY}} \Bigr)^2 \]

might be a very useful majorant for the primes.
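
The relation between $\Lambda$ and its divisor-truncated version can be checked directly. The
sketch below evaluates $\sum_{d | n} \mu(d) \log(n/d)$ and the truncated sum over $d \leq R$
by brute force. The function names are illustrative, and the truncated function is taken to
be $\sum_{d | n,\, d \leq R} \mu(d) \log(R/d)$, which is the form assumed in this sketch.

```python
import math


def mobius(n):
    """Moebius function by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a square factor
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def von_mangoldt(n):
    """Lambda(n) computed from the formula sum_{d | n} mu(d) log(n/d)."""
    return sum(mobius(d) * math.log(n / d) for d in divisors(n))


def truncated_von_mangoldt(n, R):
    """Divisor-truncated analogue: sum over d | n, d <= R of mu(d) log(R/d)."""
    return sum(mobius(d) * math.log(R / d) for d in divisors(n) if d <= R)


for n in (8, 9, 12, 101):
    print(n, round(von_mangoldt(n), 6), round(truncated_von_mangoldt(n, 10), 6))
```

For a prime $n > R$ only $d = 1$ contributes, so the truncated sum equals $\log R$; after
dividing by $\log R$ and squaring one recovers the value 1 required of a majorant.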

What might we hope to do with such a majorant? By the computation leading to (1.3),
we see that it is possible to find an asymptotic for

\[ \mathbb{E}_{N \leq n < 2N} \Lambda_R(n)^2 \tag{1.4} \]

provided that $R \leq N^{1/2 - \epsilon}$. Later on we will wish to consider more complicated
expressions involving genuine primes, such as

\[ \mathbb{E}_{N \leq n < 2N} \Lambda'(n + 2) \Lambda_R(n)^2. \tag{1.5} \]

Here we write $\Lambda'$ for the von Mangoldt function restricted to primes (as opposed to
prime powers), thus

\[ \Lambda'(n) := \begin{cases} \log n & \text{if $n$ is prime} \\ 0 & \text{otherwise.} \end{cases} \]

The problem of evaluating (1.5) may be thought of as a kind of approximation to the
twin prime problem, though we do not know of a way to relate this expression to that
problem rigorously. Expanding out, we see that (1.5) is equal to

\[ \sum_{d \leq R} \sum_{d' \leq R} \mu(d) \mu(d') \log(R/d) \log(R/d')\, \mathbb{E}_{N \leq n < 2N,\, [d,d'] | n} \Lambda'(n + 2). \]

Now we expect that

\[ \mathbb{E}_{N \leq n < 2N,\, [d,d'] | n} \Lambda'(n + 2) \approx \frac{1}{\phi([d,d'])} \tag{1.6} \]

if both $d$ and $d'$ are odd.



The Bombieri-Vinogradov theorem clearly offers a chance of obtaining a statement to
this effect on average over $d, d' \leq R$ if $R \leq N^{1/4 - \epsilon}$, though there is
certainly still work to be done as the distribution of $[d, d']$ as $d, d'$ range over
$d, d' \leq R$ is not particularly uniform. For the details (which involve moment estimates
for divisor functions) see
§9]. 
Once this is done we are left with the main term

\[ \sum_{\substack{d \leq R \\ d \text{ odd}}} \sum_{\substack{d' \leq R \\ d' \text{ odd}}} \frac{\mu(d) \mu(d')}{\phi([d,d'])} \log(R/d) \log(R/d'). \tag{1.7} \]

This term (and related expressions) may all be estimated rather accurately using the
standard Dirichlet series techniques of analytic number theory, whereby the sums are
expressed as integrals involving products of $\zeta$-functions.

If we had EH($\theta$), that is to say if the primes had level of distribution $\theta$, one
could show that (1.5) is roughly (1.7) in the wider range $R \leq N^{\theta/2 - \epsilon}$. In
particular on the full Elliott-Halberstam conjecture one could work in the range
$R \leq N^{1/2 - \epsilon}$, which is essentially the same as for (1.4).

Perhaps we should make a few remarks about the form of the asymptotic for (1.7). One
may in fact show that it is

\[ \sim 2 \prod_{p \geq 3} \Bigl( 1 - \frac{1}{(p-1)^2} \Bigr) \log R. \]

We will see products such as this again later in these notes.

1.4. A strategy for gaps between primes. We now turn to a discussion of the
results of Goldston, Pintz and Yildirim themselves. From the conceptual viewpoint it
is easiest to begin by discussing the very strong conditional results proved under the
assumption of the Elliott-Halberstam conjecture. We stated in §1.2 that
they prove

\[ \liminf_{n \to \infty} (p_{n+1} - p_n) \leq 16 \tag{1.8} \]

assuming EH($\theta$) for some $\theta$ less than 1.

In fact, a much more general result is obtained. Let $\mathcal{H} = \{h_1, \dots, h_k\}$ be a
$k$-tuple of distinct integers with $h_1 < h_2 < \dots < h_k$. A generalisation of the twin
prime conjecture is that there are infinitely many $n$ such that all of
$n + h_1, \dots, n + h_k$ are prime unless this is "obviously impossible for trivial reasons",
which would be the case if there is some $p$ such that $\{h_1, \dots, h_k\}$ occupy all
residue classes (mod $p$). If this is not the case then we say that $\mathcal{H}$ is
admissible.

Goldston, Pintz and Yildirim prove the following. 

Theorem 1.4.1. Suppose that EH($\theta$) is known for some $\theta > 1/2$. Then there is
$k_0(\theta)$ with the following property. If $k \geq k_0(\theta)$ and if
$\mathcal{H} = \{h_1, \dots, h_k\}$ is an admissible $k$-tuple then for infinitely many $n$ at
least two of the numbers $n + h_1, \dots, n + h_k$ are prime.

Note in particular that

\[ \liminf_{n \to \infty} (p_{n+1} - p_n) \leq \min_{\substack{\{h_1, \dots, h_k\} \text{ admissible} \\ k \geq k_0(\theta)}} (h_k - h_1). \]

It turns out that $k_0(\theta)$ can be taken to be 6 for $\theta > 0.95971$, and this leads to
(1.8) since the 6-tuple $\{0, 4, 6, 10, 12, 16\}$ is admissible.
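
Admissibility is a finite check, since a $k$-tuple can only occupy every residue class
modulo $p$ when $p \leq k$. The sketch below tests the 6-tuple quoted above and searches, by
brute force, for a narrowest admissible $k$-tuple. The function names are illustrative and
the search is exponential in $k$, so it is only meant for very small $k$.

```python
from itertools import combinations


def is_admissible(hs):
    """True if the integers hs miss at least one residue class modulo every prime."""
    k = len(hs)
    small_primes = [p for p in range(2, k + 1) if all(p % d for d in range(2, p))]
    # Only primes p <= k can have all of their residue classes occupied.
    return all(len({h % p for h in hs}) < p for p in small_primes)


def narrowest_admissible(k):
    """Return an admissible k-tuple (0 = h_1 < ... < h_k) of least diameter, for k >= 2."""
    diameter = k - 1
    while True:
        for middle in combinations(range(1, diameter), k - 2):
            hs = (0, *middle, diameter)
            if is_admissible(hs):
                return hs
        diameter += 1


print(is_admissible((0, 4, 6, 10, 12, 16)))  # the 6-tuple from the text
print(is_admissible((0, 2, 4)))              # occupies every class modulo 3
print(narrowest_admissible(6))
```

The search confirms that no admissible 6-tuple has diameter less than 16, which is why a
6-tuple result yields the bound 16 in (1.8).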

Here is a very general strategy for detecting primes in admissible tuples. According
to [12], this has its origins in work of Selberg and Heath-Brown. Fix a range $[N, 2N)$,
suppose that $0 \leq h_1 < \dots < h_k \leq N$, and let $(\mu_n)_{N \leq n < 2N}$ be arbitrary
non-negative weights which are certainly allowed to depend on the $h_i$ (these weights are
not to be confused with the Möbius function $\mu(d)$). We will compare

\[ Q_1 := \mathbb{E}_{N \leq n < 2N} \mu_n \]

with

\[ Q_2^{(i)} := \mathbb{E}_{N \leq n < 2N} \frac{\Lambda'(n + h_i)}{\log 3N} \mu_n, \]

for $i = 1, \dots, k$. T. Tao remarked to me that a nice way to think of this is as follows:
one may renormalise so that $Q_1 = 1$, and then $p^{(i)} := Q_2^{(i)}/Q_1$ is essentially the
probability that $n + h_i$ is prime if $n$ is drawn at random from the distribution $\mu$ (one
might then write the expected values in the definitions of $Q_1$ and $Q_2^{(i)}$ as integrals
with respect to $\mu$).

Now if one can choose the weights $\mu$ so that $p^{(i)} > 1/k$, we will have upon summing
over $i = 1, \dots, k$ that

\[ \mathbb{E}_{N \leq n < 2N} \Bigl( \sum_{i=1}^{k} \frac{\Lambda'(n + h_i)}{\log 3N} - 1 \Bigr) \mu_n > 0, \]

which means that there is some $n$ such that

\[ \Lambda'(n + h_1) + \dots + \Lambda'(n + h_k) > \log 3N. \]

For such an $n$, at least two of $n + h_1, \dots, n + h_k$ are prime, since each term
$\Lambda'(n + h_i)$ is at most $\log(n + h_i) < \log 3N$. In the probabilistic language,
we have essentially used the fact that if $n$ is drawn at random from $\mu$, the expected
number of primes amongst $n + h_1, \dots, n + h_k$ is $> 1$.



1.5. Choosing good weights. We continue the discussion of the previous section.
How should the weights $\mu_n$ be chosen to optimise the factors $p^{(i)}$? In retrospect, one
may view most of the earlier developments on gaps between primes as attempts to find
good weights $\mu_n$ in this context; see [13, §4] for further remarks on this.

A very good choice of weights might be

\[ \mu_n := \Lambda'(n + h_1) \cdots \Lambda'(n + h_k). \]

One would indeed expect that $p^{(i)} \approx 1$ in this case. The only problem is that we have
no idea how to prove this, one particular issue being that we cannot show that $Q_1 \neq 0$
(indeed, this is equivalent to finding $n$ such that all of $n + h_1, \dots, n + h_k$ are
prime).

We must restrict ourselves to weights $\mu_n$ for which it is possible to estimate $Q_1$ and
$Q_2^{(i)}$. As we saw in §1.3 there is a rather large class of such weights. In fact if

\[ \mu_n = \Bigl( \sum_{d | n,\, d \leq R} \lambda_d \Bigr)^2 \]

then our chances are good if $R \leq N^{\theta/2 - \epsilon}$, where $\theta$ is the best
exponent for which we know EH($\theta$). Crucially, there is a rather more general class of
weights one is able to consider, and that is weights of the form

\[ \mu_n = \Bigl( \sum_{d | F(n),\, d \leq R} \lambda_d \Bigr)^2, \tag{1.9} \]

where $F(n) = (n + a_1) \cdots (n + a_m)$ is an integer polynomial. It is an interesting
exercise to reprise the arguments of §1.3 in this more general context. In place of the
trivial estimate

\[ \sum_{n \leq N,\, [d,d'] | n} 1 = \frac{N}{[d,d']} + O(1) \]

which we used to derive (1.3) one must instead input information about

\[ \sum_{n \leq N,\, [d,d'] | F(n)} 1. \]

But one knows that (for example) $p | F(n)$ precisely if $n \equiv -a_j \pmod{p}$ for some
$j \in \{1, \dots, m\}$, and so such a task is not too frightening. The same is true of sums,
such as (1.6), which involve the von Mangoldt function.
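
The remark about the residue classes in which $p | F(n)$ can be checked mechanically. The
sketch below compares the predicted set $\{-a_j \bmod p\}$ with a brute-force search over all
residues. The function names and the choice of shifts (the 6-tuple used earlier) are
illustrative assumptions.

```python
def predicted_roots(shifts, p):
    """Residues n mod p with p | F(n), where F(n) is the product of (n + a) over a in shifts."""
    return {(-a) % p for a in shifts}


def brute_force_roots(shifts, p):
    """The same set, found by evaluating F at every residue class modulo p."""
    roots = set()
    for n in range(p):
        value = 1
        for a in shifts:
            value *= n + a
        if value % p == 0:
            roots.add(n)
    return roots


shifts = (0, 4, 6, 10, 12, 16)
for p in (2, 3, 5, 7, 11, 13):
    print(p, sorted(predicted_roots(shifts, p)), len(predicted_roots(shifts, p)) / p)
```

The last column is the proportion of $n$ with $p | F(n)$, which takes the place of the
density $1/p$ appearing in the simpler estimate for $p | n$.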

For the remainder of the discussion we will narrow down our search for good weights
to those having the form (1.9). We will assume that for any sensible choice of $F$
and the $\lambda_d$ we may evaluate $Q_1$ and $Q_2^{(i)}$ for $R \leq N^{\theta/2 - \epsilon}$
using the standard techniques which we sketched in §1.3.

Now we remarked that an ideal choice of weights is

\[ \mu_n = \Lambda'(n + h_1) \cdots \Lambda'(n + h_k), \]

but we cannot compute with this choice. A closely related choice of weights is

\[ \mu_n = \Lambda_k((n + h_1) \cdots (n + h_k)). \]

Here the function $\Lambda_k$ is defined by

\[ \Lambda_k(n) := \sum_{d | n} \mu(d) \log^k(n/d), \]

and so in particular $\Lambda_1 = \Lambda$. For general $k$ the function $\Lambda_k$ is
supported on those integers with at most $k$ distinct prime factors. (One way to check this
is to use the identity

\[ \Lambda_k = L \Lambda_{k-1} + \Lambda * \Lambda_{k-1}, \]

where $L(n) := \log n$ and $*$ denotes Dirichlet convolution.)
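
Both the support property and the recursion can be tested numerically for small $n$. The
sketch below assumes the definition $\Lambda_k(n) = \sum_{d | n} \mu(d) \log^k(n/d)$; the
function names, the range $n \leq 300$ and the floating-point tolerance are choices made for
this illustration.

```python
import math


def mobius(n):
    """Moebius function by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def omega(n):
    """Number of distinct prime factors of n."""
    return sum(1 for p in divisors(n) if p > 1 and all(p % d for d in range(2, p)))


def gen_von_mangoldt(k, n):
    """Lambda_k(n) = sum over d | n of mu(d) * log(n/d)^k."""
    return sum(mobius(d) * math.log(n / d) ** k for d in divisors(n))


def recursion_rhs(k, n):
    """L * Lambda_{k-1} + (Lambda convolved with Lambda_{k-1}), evaluated at n."""
    convolution = sum(
        gen_von_mangoldt(1, d) * gen_von_mangoldt(k - 1, n // d) for d in divisors(n)
    )
    return math.log(n) * gen_von_mangoldt(k - 1, n) + convolution


worst_recursion_error = max(
    abs(gen_von_mangoldt(k, n) - recursion_rhs(k, n))
    for k in (2, 3) for n in range(1, 301)
)
worst_support_value = max(
    abs(gen_von_mangoldt(k, n))
    for k in (1, 2, 3) for n in range(1, 301) if omega(n) > k
)
print(worst_recursion_error, worst_support_value)
```

Both printed quantities are zero up to rounding error, as the identity and the support
statement predict.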

Now in §1.3 we saw the advantages of replacing $\Lambda$ with $\Lambda_R$, a
divisor-truncated version of it. By analogy one might consider

\[ \Lambda_{k,R}(n) := \sum_{d | n,\, d \leq R} \mu(d) \log^k(R/d). \]

This could be negative, but its square $\Lambda_{k,R}^2$ certainly cannot. Furthermore

\[ \mu_n := \Lambda_{k,R}((n + h_1) \cdots (n + h_k))^2 \]

is of the form (1.9) (with $F(n) = (n + h_1) \cdots (n + h_k)$). This, then, is a very natural
choice of the weights $\mu_n$ and it is with some anticipation that we await the results of
computing $Q_1$ and the $Q_2^{(i)}$, and hence the factors $p^{(i)}$. What we obtain is this:

\[ p^{(i)} \approx \frac{2}{k+1} \cdot \frac{\log R}{\log N}. \]

This is something of a disappointment, since it is not greater than $1/k$ even when
$R = N^{\theta/2 - \epsilon}$ for $\theta$ very close to 1 (that is, with a very strong form of
the Elliott-Halberstam conjecture).

Astonishingly, a seemingly small change tips the balance in our favour. We consider
instead

\[ \mu_n := \Lambda_{k+l,R}((n + h_1) \cdots (n + h_k))^2, \]

where $0 < l \leq k$ is a further parameter. In the probabilistic language, $p^{(i)}$ may then
be thought of (very roughly) as something like the probability that $n + h_i$ is prime given
that $(n + h_1) \cdots (n + h_k)$ has at most $k + l$ prime factors.

With this choice of weights it is possible to compute that

\[ p^{(i)} \approx \frac{2}{k + 2l + 1} \cdot \frac{2l + 1}{l + 1} \cdot \frac{\log R}{\log N}. \tag{1.10} \]

Note that as $k, l \to \infty$ with $l = o(k)$, this is essentially $4 \log R / k \log N$. In
particular with $R := N^{\theta/2 - \epsilon}$ one has $p^{(i)} > 1/k$ for $k \geq k_0(\theta)$,
for any $\theta > 1/2$. This is enough to establish Theorem 1.4.1.
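
The arithmetic behind the last paragraph can be explored with exact rational numbers. The
sketch below treats the approximate formula for $p^{(i)}$ as if it were an identity, takes
$\log R / \log N = \theta/2$ (ignoring the $\epsilon$), and searches for the smallest $k$ for
which some $0 < l \leq k$ gives $k\,p^{(i)} > 1$. These simplifications, the function names
and the search bound `k_max` are assumptions of this illustration, not part of the argument
of Goldston, Pintz and Yildirim.

```python
from fractions import Fraction


def k_times_p(k, l, theta):
    """k * p^(i) from the heuristic formula, with log R / log N replaced by theta / 2."""
    main_factor = Fraction(2 * (2 * l + 1), (k + 2 * l + 1) * (l + 1))
    return k * main_factor * theta / 2


def smallest_k(theta, k_max=200):
    """Smallest k <= k_max (with a witness l) such that k * p^(i) > 1, or None."""
    for k in range(2, k_max + 1):
        for l in range(1, k + 1):
            if k_times_p(k, l, theta) > 1:
                return k, l
    return None


print(smallest_k(Fraction(24, 25)))  # theta = 0.96
print(smallest_k(Fraction(3, 5)))    # theta = 0.6
print(smallest_k(Fraction(1, 2)))    # theta = 1/2: no admissible k is found
```

With $\theta = 0.96$ this crude search returns $k = 7$, $l = 1$, so the approximate formula on
its own does not reproduce the value $k_0 = 6$ quoted earlier. At $\theta = 1/2$ the search
finds nothing, which is the "only just" phenomenon that the next section has to work around.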



1.6. The unconditional result. In this section we explain some aspects of the
proof that

\[ \liminf_{n \to \infty} \frac{p_{n+1} - p_n}{\log n} = 0. \]

Recall that in the last section we chose weights

\[ \mu_n := \Lambda_{k+l,R}((n + h_1) \cdots (n + h_k))^2. \]

Taking $R := N^{1/4 - \epsilon}$, the quantities $Q_1$ and $Q_2^{(i)}$ can be evaluated using
the Bombieri-Vinogradov theorem (rather than the Elliott-Halberstam conjecture). If $k, l$ are
large with $l \ll k$ then (from (1.10)) the quantities $p^{(i)}$ are all at least
$1/k - \epsilon'$ for $k \geq k_1(\epsilon')$, and hence we have that

\[ \mathbb{E}_{N \leq n < 2N} \Bigl( \sum_{i=1}^{k} \frac{\Lambda'(n + h_i)}{\log 3N} \Bigr) \mu_n \geq (1 - \delta) \|\mu\|_1, \tag{1.11} \]

for any $\delta > 0$, and any $k \geq k_2(\delta)$. Here

\[ \|\mu\|_1 := \mathbb{E}_{N \leq n < 2N} \mu_n \]

is the total mass of $\mu$. This means that if $n$ is drawn at random from the distribution
$\mu$, then the expected number of primes amongst the numbers $n + h_1, \dots, n + h_k$ is at
least $1 - \delta$. Clearly, this is not an immediately applicable result if one wishes to
obtain two or more primes in the tuple $\{n + h_1, \dots, n + h_k\}$.
There is, however, a tiny bit of further information available to us. Even if
$h_0 \notin \{h_1, \dots, h_k\}$, there is still a chance that if $n$ is drawn at random from
$\mu$ then $n + h_0$ will be prime. One can work out that this probability is asymptotically

\[ \frac{\mathfrak{S}(h_0, h_1, \dots, h_k)}{\log N}, \]

where $\mathfrak{S}(h_0, h_1, \dots, h_k)$ is a certain singular series reflecting the
arithmetic properties of the numbers $h_0, h_1, \dots, h_k$ (it is similar in form to the
product defining the twin prime constant). Formally what we mean by this is that

\[ \mathbb{E}_{N \leq n < 2N} \frac{\Lambda'(n + h_0)}{\log 3N} \mu_n \approx \frac{\mathfrak{S}(h_0, h_1, \dots, h_k)}{\log N} \|\mu\|_1, \]

and this may once again be rigorously established using the Bombieri-Vinogradov theorem.

Let $\eta > 0$ be arbitrary. Summing over all $h_0$ with

\[ 0 \leq h_0 \leq \eta \log N, \]

we obtain from (1.11) that

\[ \mathbb{E}_{N \leq n < 2N} \Bigl( \sum_{h=0}^{\eta \log N} \frac{\Lambda'(n + h)}{\log 3N} \Bigr) \mu_n \geq \Bigl( 1 - \delta + \sum_{\substack{h_0 = 0 \\ h_0 \notin \{h_1, \dots, h_k\}}}^{\eta \log N} \frac{\mathfrak{S}(h_0, h_1, \dots, h_k)}{\log N} \Bigr) \|\mu\|_1. \tag{1.12} \]

For a typical choice of $h_1, \dots, h_k$, the right-hand side will be of a predictable size.
Indeed by a result of Gallagher one may infer that

\[ \sum_{\substack{h_0, \dots, h_k \leq H \\ h_i \text{ distinct}}} \mathfrak{S}(h_0, h_1, \dots, h_k) \sim H^{k+1} \]

as $H \to \infty$. Taking expectations over all $k$-tuples $\{h_1, \dots, h_k\}$ of distinct
integers with $0 \leq h_i \leq \eta \log N$, we therefore obtain from (1.12) that, on average
over such tuples,

\[ \mathbb{E}_{N \leq n < 2N} \Bigl( \sum_{h=0}^{\eta \log N} \frac{\Lambda'(n + h)}{\log 3N} \Bigr) \mu_n \geq (1 - \delta + \eta - o(1)) \|\mu\|_1. \]

Recall that this is valid for any $\delta > 0$, provided that $k$ is sufficiently large. Taking
$\delta = \eta/2$, we therefore see that (if $n$ is drawn at random from $\mu$) the expected
number of primes in the interval $[n, n + \eta \log N]$ is strictly greater than 1. In
particular there is some such interval containing at least two primes.



1.7. Further results. Since the original paper of Goldston, Pintz and Yildirim
several further works have appeared or are scheduled to appear giving refinements and
variants of the main theorem. Here is a summary of what has been done:

• In the forthcoming paper [TU] it is shown, by refining the ideas just described
as far as seems possible, that

\[ \liminf_{n \to \infty} \frac{p_{n+1} - p_n}{(\log p_n)^{1/2} (\log \log p_n)^2} < \infty. \]

This remarkable result asserts that the gap between primes is infinitely often
almost as small as the square root of the average gap.

• Let $q_1 < q_2 < \dots$ be the numbers which are the product of exactly two distinct
primes. Then

\[ \liminf_{n \to \infty} (q_{n+1} - q_n) \leq 26, \]

and in fact

\[ \liminf_{n \to \infty} (q_{n+1} - q_n) \leq 6 \]

if one assumes the Elliott-Halberstam conjecture. These results are obtained in
the paper [H] by the three authors and S. W. Graham.

• In the original paper [13] one also finds results concerning several primes
bunching together. Thus assuming EH($\theta$) one has, for any $r \geq 2$, the bound

\[ \liminf_{n \to \infty} \frac{p_{n+r} - p_n}{\log p_n} \leq (\sqrt{r} - \sqrt{2\theta})^2. \]

One rather curious feature of this bound is that the full Elliott-Halberstam
conjecture EH(1) implies that

\[ \liminf_{n \to \infty} \frac{p_{n+2} - p_n}{\log p_n} = 0. \]

Nothing of this sort is known for $p_{n+3} - p_n$, even conditionally.

2. Binary digits of primes 

These notes originated from a course I lectured in Part III of the Mathematical Tripos 
at Cambridge University in the Lent Term 2007. My aim was to work through the 
result of Mauduit and Rivat in the case of binary expansions of primes, and to produce 
a reasonably short exposition (the original paper is 49 pages long). Because we provide 
complete details this section is considerably more technical than either §1 or §3. Readers
not interested in these technicalities may still wish to read §2.3.

2.1. Statement of results. In a recent preprint Mauduit and Rivat proved
that asymptotically 50% of the primes have odd digit sum in base 2 (and hence, of
course, 50% of the primes have even digit sum in base 2 as well!), answering a
long-standing question of Gelfond. Our aim in this section is to give a self-contained proof
of their theorem in the following form, which is easily seen to imply the result as just
stated.

Theorem 2.1.1 (Mauduit-Rivat). Let $\Lambda$ be the von Mangoldt function, and let
$s : \mathbb{N} \to \mathbb{N}$ be the function which sums the binary digits of $n$. Then

\[ \mathbb{E}_{n \leq X} \Lambda(n) (-1)^{s(n)} = O(X^{-\delta}) \]

for some $\delta > 0$.
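
The quantity in the theorem is straightforward to compute for small $X$. The sketch below
evaluates the normalised sum $\frac{1}{X} \sum_{n \leq X} \Lambda(n) (-1)^{s(n)}$ at a few
powers of two, alongside the trivial bound $\frac{1}{X} \sum_{n \leq X} \Lambda(n)$. The
function names and the sample values of $X$ are choices made for this illustration; such
computations only indicate cancellation and are no substitute for the proof.

```python
import math


def von_mangoldt_table(limit):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_composite = bytearray(limit + 1)
    for p in range(2, limit + 1):
        if not is_composite[p]:
            for m in range(p * p, limit + 1, p):
                is_composite[m] = 1
            pk = p
            while pk <= limit:
                lam[pk] = math.log(p)
                pk *= p
    return lam


def binary_digit_sum(n):
    """s(n): the sum of the binary digits of n."""
    return bin(n).count("1")


def normalised_sum(X):
    """(1/X) * sum over n <= X of Lambda(n) * (-1)^{s(n)}."""
    lam = von_mangoldt_table(X)
    return sum(lam[n] * (-1) ** binary_digit_sum(n) for n in range(1, X + 1)) / X


def trivial_bound(X):
    """(1/X) * sum over n <= X of Lambda(n), the bound obtained without cancellation."""
    return sum(von_mangoldt_table(X)) / X


for exponent in (8, 12, 16):
    X = 2 ** exponent
    print(X, normalised_sum(X), trivial_bound(X))
```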

In fact Mauduit and Rivat proved rather more than this: they counted primes whose
digit sum in base $q$ is congruent to $a \pmod{m}$, for any natural numbers $a, q, m$. To prove
a result in this generality requires some extra technical devices and a lot more notation,
so we leave the interested reader to consult the original paper.

I find the result intrinsically interesting, but another reason for studying it is that it
represents a pleasingly self-contained example of Vinogradov's method of 'Type I and
II sums' or 'bilinear forms' for handling prime number sums.

In contrast with the last chapter we provide a fairly complete technical discussion. 



2.2. Some notation. Throughout this section $X$ will be thought of as a large
parameter. If $Y_1$ and $Y_2$ are two quantities which depend on $X$ we write
$Y_1 \lessapprox Y_2$ if $Y_1 \ll_{\epsilon} X^{\epsilon} Y_2$ for all $\epsilon > 0$. We use
the notation $Y_1 \gtrapprox Y_2$ similarly. Typically we might have
$Y_1 \ll Y_2 \log^{C} X$ for some $C$, but the notation is occasionally applied in even
looser situations. For example we have the bound $\tau(x) \lessapprox 1$ uniformly in
$x \leq X$, where $\tau(x)$ denotes the number of divisors of $x$.

If $n \in \mathbb{N}$ we write $\nu_2(n)$ for the 2-exponent of $n$, the maximal $r$ such that
$2^r | n$.

The letter c will always denote a small, positive, absolute constant. Different instances 
of the letter will not necessarily denote the same constant. 



2.3. Vinogradov's method of Type I/II sums. Suppose that $f : \mathbb{N} \to \mathbb{C}$ is a
function, bounded by 1. This section concerns a method which can often be used to
show that a sum of the form

\[ \sum_{n \leq X} \Lambda(n) f(n) \]

is substantially smaller than $X$. The method is particularly inclined to work when $f$ is
"far" from being multiplicative. It is clear that such a sum cannot be small in many
cases when $f$ does have some multiplicative tendencies, for example when $f(n) = \mu(n)$
or when $f(n)$ is a Dirichlet character to small modulus.

We will develop a form of the method which includes some technical refinements
particularly suited to the study of the function $f(n) = (-1)^{s(n)}$. These are rather
insubstantial and consist in large part of ensuring that various cutoffs are exact powers
of two. In these notes we will endow the $\sim$ symbol with a rather specific meaning: if we
write $\sum_{n \sim 2^{\nu}}$ then we understand that the variable $n$ is to range over the
dyadic interval $[2^{\nu - 1}, 2^{\nu})$.

Here is the version of Vinogradov's method that we shall require. 

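Proposition 2.3.1 (Method of Type I/II sums). Suppose that $f : \mathbb{N} \to \mathbb{C}$ is a
function with $|f(n)| \leq 1$ for all $n$. Let $\delta \in (0, 1]$ be a parameter. We say that
Type I sums are $\delta$-small if

\[ \mathbb{E}_{m \sim 2^{\mu}} \bigl| \mathbb{E}_{n \sim 2^{\nu}} 1_{I_m}(n) f(mn) \bigr| \leq \delta \]

whenever $2^{\mu} \leq X^{1/100}$, $I_m \subseteq [2^{\nu - 1}, 2^{\nu})$ is an interval and
$2^{\mu + \nu} \leq X$. We say that Type II sums are $\delta$-small if

\[ \bigl| \mathbb{E}_{m \sim 2^{\mu}} \mathbb{E}_{n \sim 2^{\nu}} a_m b_n f(mn) \bigr| \leq \delta \]

uniformly for all sequences $(a_m)$, $(b_n)$ with $|a_m|, |b_n| \leq 1$ and for all
$\mu, \nu$ such that $X^{1/100} \leq 2^{\mu}, 2^{\nu} \leq X^{99/100}$. Suppose that $f$ is a
function for which Type I and Type II sums are $\delta$-small. Then

\[ \mathbb{E}_{n \leq X} \Lambda(n) f(n) \ll \delta^{1/2} (\log X)^{C} \]

for some absolute constant $C$.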

Remarks. One can take $C = 11/2$. Note that both Type I and Type II sums are
trivially bounded by 1. The important feature of the proposition, then, is that a big
enough gain over these trivial bounds leads to an improvement of the trivial bound on
$\mathbb{E}_{n \leq X} \Lambda(n) f(n)$, which is $O(\log X)$.

In our statement of the result we have made a particular choice of the ranges of $\mu, \nu$
for which estimates for Type I/II sums are required. There is considerable flexibility
in the choice of these ranges. Though this matter does not concern us here, we remark
that it has apparently not been completely clarified in the literature (though see [P, §8]
for a related discussion).

When is an attempt to use the method of Type I/II sums likely to be successful in
establishing a non-trivial bound for $\mathbb{E}_{n \leq X} \Lambda(n) f(n)$? In general, one
might hope for success when $f$ does not behave "multiplicatively". Certainly if $f$ is
multiplicative, say $f = \chi$ for some Dirichlet character $\chi$, then by choosing
$a_m = \overline{f(m)}$ and $b_n = \overline{f(n)}$ we see that the Type II sums are not always
small. A similar phenomenon persists for those $f$ which are the sum of a few multiplicative
functions, for example the additive character $f(n) = e(an/q)$ with $q$ a small integer.
Fortunately one can estimate $\mathbb{E}_{n \leq X} \Lambda(n) f(n)$ for these functions by
other means, namely the explicit formula and theorems on the location of zeroes of
$L$-functions.
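
The obstruction coming from multiplicative functions is easy to see in a concrete case. The
sketch below takes $f = \chi$ to be the non-trivial character modulo 4 and evaluates the
normalised bilinear average $\mathbb{E}_{m \sim 2^{\mu}} \mathbb{E}_{n \sim 2^{\nu}} a_m b_n f(mn)$
with $a_m = \overline{\chi(m)}$, $b_n = \overline{\chi(n)}$. The normalisation by the lengths
of the dyadic ranges, the function names and the choice $\mu = \nu = 6$ are assumptions of
this illustration.

```python
def chi4(n):
    """The non-trivial Dirichlet character modulo 4 (real-valued, so equal to its conjugate)."""
    return {1: 1, 3: -1}.get(n % 4, 0)


def dyadic(nu):
    """The dyadic range [2^(nu-1), 2^nu)."""
    return range(2 ** (nu - 1), 2 ** nu)


def type_ii_average(f, a, b, mu, nu):
    """Normalised average of a(m) * b(n) * f(m * n) over m ~ 2^mu, n ~ 2^nu."""
    ms, ns = dyadic(mu), dyadic(nu)
    total = sum(a(m) * b(n) * f(m * n) for m in ms for n in ns)
    return total / (len(ms) * len(ns))


value = type_ii_average(chi4, chi4, chi4, 6, 6)
print(value)
```

Because $\chi(mn) = \chi(m)\chi(n)$, every term with $m$ and $n$ both odd contributes $+1$
and there is no cancellation at all: the average equals the proportion of such pairs, namely
$1/4$, however large the ranges are taken.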

If, on the other hand, $f$ does not exhibit significant multiplicative behaviour then one
may hope that the method of Type I and II sums will work. The most classical instance
of this, worked out by Vinogradov in the course of proving that every large odd number
is the sum of three primes, is the case $f(n) = e(\alpha n)$ where $\alpha$ is not close to a
rational $a/q$. Our task here is to handle the case $f(n) = (-1)^{s(n)}$.



2.4. Proof of the method of Type I/II sums. Even though Proposition 2.3.1 is
not normally stated with quite the same technical refinements that we have given, the
reader familiar with the basics of this theory may prefer to skip this somewhat technical
section.

Before making a start on the proof proper, we show that the assumption that Type II
sums are small implies that Type I sums are small for a much larger range of $\mu$ than
we have hypothesised.
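Lemma 2.4.1 (Type II implies extended Type I). Suppose that
$f : \{1, \dots, N\} \to \mathbb{C}$ is a function with $\|f\|_{\infty} \leq 1$ for which Type I
and Type II sums are $\delta$-small. Then we have the Type I estimate

\[ \mathbb{E}_{m \sim 2^{\mu}} \bigl| \mathbb{E}_{n \sim 2^{\nu}} 1_{I_m}(n) f(mn) \bigr| \ll \delta \log X \]

in the extended range $2^{\mu} \leq X^{99/100}$.

Proof. Since Type I sums are $\delta$-small, we may assume that $2^{\mu} \geq X^{1/100}$. It
clearly suffices to prove that

\[ \mathbb{E}_{m \sim 2^{\mu}} \mathbb{E}_{n \sim 2^{\nu}} u_m 1_{I_m}(n) f(mn) \ll \delta \log X \tag{2.1} \]

for all choices of $u_m$ with $|u_m| = 1$. But for the cutoff $1_{I_m}(n)$, which does not
factor as a product of a function of $m$ with a function of $n$, this looks like a Type II
sum. To remove the cutoff, we employ a standard "separation of variables" trick. By the
Fourier inversion formula on $\mathbb{Z}$ we have

\[ 1_{I_m}(n) = \int_0^1 \widehat{1_{I_m}}(\theta) e(n\theta) \, d\theta. \]

Thus the left-hand side of (2.1) is equal to

\[ \int_0^1 \Bigl( \mathbb{E}_{m \sim 2^{\mu}} \mathbb{E}_{n \sim 2^{\nu}} u_m \widehat{1_{I_m}}(\theta) e(\theta n) f(mn) \Bigr) d\theta. \]

Since Type II sums are $\delta$-small and

\[ |\widehat{1_{I_m}}(\theta)| \ll \min(X, |\theta|^{-1}), \]

where $|\theta|$ is understood as the distance from $\theta$ to the nearest integer, we see
that this is at most

\[ \delta \int_0^1 \min(X, |\theta|^{-1}) \, d\theta \ll \delta \log X. \]

This concludes the proof. □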
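
The two ingredients of the separation of variables trick, Fourier inversion for the
indicator of an interval and the decay of its Fourier transform, can both be checked
numerically. In the sketch below the integral over $[0, 1)$ is replaced by an $M$-point
Riemann sum, which is exact as long as $M$ exceeds every difference $|n - k|$ that occurs.
The function names, the interval $[20, 30)$ and $M = 64$ are choices made for this
illustration.

```python
import cmath
import math


def e(x):
    """The standard additive character e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)


def interval_transform(interval, theta):
    """Fourier transform at theta of the indicator function of a range of integers."""
    return sum(e(-k * theta) for k in interval)


def reconstruct_indicator(interval, n, M):
    """M-point Riemann sum for the integral of transform(theta) * e(n * theta) over [0, 1)."""
    return sum(interval_transform(interval, j / M) * e(n * j / M) for j in range(M)) / M


interval = range(20, 30)
M = 64

# Fourier inversion: the reconstruction should reproduce the indicator of the interval.
inversion_error = max(
    abs(reconstruct_indicator(interval, n, M) - (1 if n in interval else 0))
    for n in range(50)
)

# Decay: |transform(theta)| <= min(length, 1 / (2 * distance of theta to the nearest integer)).
decay_ok = all(
    abs(interval_transform(interval, j / M))
    <= min(len(interval), 1 / (2 * min(j / M, 1 - j / M))) + 1e-9
    for j in range(1, M)
)
print(inversion_error, decay_ok)
```

The decay bound is what makes the integral of $|\widehat{1_{I}}(\theta)|$ only logarithmically
large, which is the source of the factor $\log X$ lost in the lemma.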

Proof of Proposition 2.3.1. The argument is essentially that of Vaughan [3H], but I have
followed the beautiful presentation in the book of Iwaniec and Kowalski. We start
with the easily-verified relation

\[ \Lambda(n) = \sum_{\substack{b, c \\ bc | n}} \Lambda(b) \mu(c). \]

Let $U := X^{1/3}$ (say), and decompose this sum as

\[ \Lambda(n) = \Lambda_{\sharp\sharp}(n) + \Lambda_{\sharp\flat}(n) + \Lambda_{\flat\sharp}(n) + \Lambda_{\flat\flat}(n), \]

where $\flat$ denotes "large" divisors and $\sharp$ denotes "small" divisors, so that for
example

\[ \Lambda_{\flat\sharp}(n) := \sum_{\substack{b \geq U,\, c < U \\ bc | n}} \Lambda(b) \mu(c). \]
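We observe that

\[ \Lambda_{\sharp\sharp}(n) + \Lambda_{\sharp\flat}(n) = \sum_{\substack{b < U \\ b | n}} \Lambda(b) \sum_{c | \frac{n}{b}} \mu(c) = 1_{n < U} \Lambda(n) \]

whilst

\[ \Lambda_{\flat\sharp}(n) + \Lambda_{\sharp\sharp}(n) = \sum_{\substack{c < U \\ c | n}} \mu(c) \sum_{b | \frac{n}{c}} \Lambda(b) = \sum_{\substack{c < U \\ c | n}} \mu(c) \log(n/c). \]

Thus we obtain what is essentially Vaughan's identity,

\[ \Lambda(n) = -\Lambda_{\sharp\sharp}(n) + \Lambda_{\flat\flat}(n) + 1_{n < U} \Lambda(n) + \sum_{\substack{c < U \\ c | n}} \mu(c) \log(n/c). \tag{2.2} \]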



Weighting by f{n) and summing over n ^ X we obtain 



n^X n^X n<U c<U n^X 

c\n 

= 5*1 + 5*2 + 5*3 + 5*4. 

Our objective is to use the assumption that Type I and II sums are small in order to 
bound the sums Sj. 
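
The identity (2.2) is easy to test numerically. The following is a minimal sketch, not part of the argument: the function names `arithmetic_tables` and `vaughan_right_hand_side` are hypothetical, and the cutoff `U` is simply a parameter here rather than the specific power of $X$ chosen in the text.

```python
from math import log

def arithmetic_tables(limit):
    """Return (mu, lam): Moebius and von Mangoldt values for 0..limit."""
    mu = [1] * (limit + 1)
    lam = [0.0] * (limit + 1)
    composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if composite[p]:
            continue
        for m in range(2 * p, limit + 1, p):
            composite[m] = True
        for m in range(p, limit + 1, p):
            mu[m] = -mu[m]          # one sign change per prime factor
        for m in range(p * p, limit + 1, p * p):
            mu[m] = 0               # not squarefree
        q = p
        while q <= limit:
            lam[q] = log(p)         # Lambda(p^j) = log p
            q *= p
    return mu, lam

def vaughan_right_hand_side(n, U, mu, lam):
    """Evaluate the right-hand side of (2.2) at n, with cutoff U."""
    divisors = [d for d in range(1, n + 1) if n % d == 0]
    pairs = [(b, c) for b in divisors for c in divisors if n % (b * c) == 0]
    small_small = sum(lam[b] * mu[c] for b, c in pairs if b < U and c < U)
    large_large = sum(lam[b] * mu[c] for b, c in pairs if b >= U and c >= U)
    low_term = lam[n] if n < U else 0.0
    log_term = sum(mu[c] * log(n / c) for c in divisors if c < U)
    return -small_small + large_large + low_term + log_term
```

Comparing `vaughan_right_hand_side(n, U, mu, lam)` with `lam[n]` over a range of `n` and a few values of `U` should give agreement up to floating-point error, whatever cutoff is used.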



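Bounding $S_1$. We have
\[ S_1 = \sum_{b, c < U} \Lambda(b) \mu(c) \sum_{n \le X/bc} f(nbc). \]
This may be rewritten as
\[ \sum_{m < U^2} \omega_m \sum_{n \le X/m} f(mn), \tag{2.3} \]
where
\[ \omega_m := \sum_{b, c < U,\ bc = m} \Lambda(b) \mu(c). \]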

By splitting into O(log^X) dyadic ranges for m and n, we see that it suffices to prove 
that 

E M\ E /(^^)| « S'^'Xlog^'-'X (2.4) 

whenever C [2'^^^,2'') is an interval and 2^+*^ ^ X. This does not quite follow from 
Lemma [2.4. II since Um is not bounded. We do, however have \ujm\ <^ r(m) logX, where 
r is the divisor function, and hence since X]n<Af ''"('^)^ ^ Xlog'^X we have 



E kml' < 2^1og^X 

This means that for any $Q > 0$ we have the estimate

^ 1.7^1 «2^Q-Mog^X 

Splitting the sum in (2.4) into two parts according as $|\omega_m|$ is or is not greater than $Q$
we obtain

By Lemma 2.4.1 this is bounded by

Q-^X\og^ X + 5QX log X. 
Choosing Q := 5^^/^ log^''^ X, we obtain a bound of the required type. 

Bounding $S_2$. The sum $S_2$ may be written as
\[ \sum_{b, c \ge U} \sum_{t \le X/bc} \Lambda(b) \mu(c) f(bct). \]
Changing variables and rearranging, this may be written as
\[ \sum_{mn \le X} a_m b_n f(mn), \tag{2.5} \]
where $a_m := \Lambda(m)$ and $b_n := \sum_{c \ge U,\ c \mid n} \mu(c)$. Observing that $b_n = 0$ if $n < U$, we see that the sum over $m, n$ is covered by $O(\log^2 X)$ dyadic ranges $m \sim 2^\mu$, $n \sim 2^\nu$ with $X^{0.01} \le 2^\mu, 2^\nu \le X^{0.99}$. We may therefore split (2.5) into $O(\log^2 X)$ sums of the form
\[ \sum_{m \sim 2^\mu} \sum_{n \sim 2^\nu} 1_{I(n)}(m) a_m b_n f(mn), \tag{2.6} \]
where $I(n) \subseteq [2^{\mu-1}, 2^\mu)$ is an interval. It suffices to show that any such sum is $\ll$

5V2Xlog^-2X. 

These sums look rather like Type II sums, except for the presence of the cutoff $1_{I(n)}(m)$ and the fact that the sequences $a_m$, $b_n$ are not bounded.

To remove the cutoff we use the same device as in the proof of Lemma 2.4.1, writing (2.6) as

1 ^ 
Since $|\widehat{1}_{I(n)}(\theta)| \ll \min(X, |\theta|^{-1})$ we see that it suffices to prove that

uniformly for all choices of $a'_m$, $b'_n$ with $|a'_m| \le |a_m|$ and $|b'_n| \le |b_n|$ for all $m, n$.

This has effectively removed the cutoff that we were worried about. It remains to deal with the non-boundedness of the sequences $(a'_m)_{m \sim 2^\mu}$, $(b'_n)_{n \sim 2^\nu}$. The non-boundedness of $(a'_m)$ is rather minor, since clearly $|a'_m| \le \log X$ for all $m$. Thus we may define $a''_m := a'_m / \log X$ and reduce to proving that



E E ^^^nfi^n) « 5'''X \og^-' X (2.7) 



whenever $|a''_m| \le 1$. To deal with $b'_n$, we note that $|b'_n| \le \tau(n)$. Thus, by the argument used in dealing with $S_1$, we have
\[ \sum_{n \sim 2^\nu,\ |b'_n| > Q} |b'_n| \ll 2^\nu Q^{-1} \log^3 X. \]
Splitting the sum in (2.7) into parts according as $|b'_n| \le Q$ or not, we conclude as before.

Bounding S3. The sum 5*3 is trivially bounded by U = X^^^ \ogX . The (5-smallness of 
Type I sums implies (rather vacuously) that $|f(n)| \le \delta X$. Since we are assuming that
$\|f\|_\infty = 1$, it follows that $\delta \ge 1/X$ and hence this term may be absorbed into the bound
S^/^X log^ X that we are trying to establish. 

Bounding $S_4$. The sum may be written as
\[ \sum_{m < U} \sum_{n \le X/m} \mu(m) \log n \, f(mn). \]
This may be split into $O(\log^2 X)$ sums of the form
\[ \sum_{m \sim 2^\mu} \sum_{n \in I_m} \mu(m) \log n \, f(mn), \tag{2.8} \]
where $I_m \subseteq [2^{\nu-1}, 2^\nu)$. This is almost a Type I sum: only the presence of the $\log n$ term prevents it being so. This term is so smooth, however, that it may be effectively removed by "partial summation". To do this, simply write
\[ \log n = \int_1^n \frac{dt}{t} \]
and rearrange (2.8) as
\[ \int_1^X \frac{dt}{t} \sum_{m \sim 2^\mu} \mu(m) \sum_{n \in I_{m,t}} f(mn), \]
where $I_{m,t} \subseteq [2^{\nu-1}, 2^\nu)$ is an interval. The smallness of Type I sums, as given in Lemma 2.4.1, may now be used to see that this is bounded as required.

This completes the bounding of $S_4$, and hence the proof of Proposition 2.3.1. □



2.5. The sum-of-digits function and related functions. If $n$ is a positive integer and $n = \sum_{i \ge 0} n_i 2^i$ is its binary expansion, we write
\[ s(n) := \sum_i n_i. \]
This is, of course, a finite sum. For each positive integer $k$ we consider also the truncated functions
\[ s_k(n) := \sum_{i=0}^{k-1} n_i. \]
These functions are closely related to $s(n)$, of course; on any interval $[2^k t, 2^k(t+1))$ the functions $s_k(n)$ and $s(n)$ differ by a fixed constant. The truncated functions $s_k(n)$ are periodic with period $2^k$, however, and this makes them rather amenable to study using Fourier analysis on the finite group $\mathbb{Z}/2^k\mathbb{Z}$.

In fact we will be more interested in the oscillatory functions
\[ f(n) := (-1)^{s(n)} \]
and
\[ f_k(n) := (-1)^{s_k(n)}. \]
The functions $s_k$ and $f_k$ may, by abuse of notation, be regarded as functions on $\mathbb{Z}/2^k\mathbb{Z}$.

Definition 2.5.1 (Finite Fourier transform). Let $k \ge 0$ be a fixed integer. Then we define
\[ \widehat{f}_k(r) := \mathbb{E}_{x \in \mathbb{Z}/2^k\mathbb{Z}} f_k(x) e(-rx/2^k). \]
We now proceed to establish several properties of the Fourier transform $\widehat{f}_k$ which will be useful later on. We collect these here since they are all proved in a very similar way. The reader might wish to read through the proof of the first one, then skip to the next section. She might return here as each result is required.

Proposition 2.5.2 ($L^\infty$ bound). We have $|\widehat{f}_k(r)| \ll 2^{-ck}$ for all $r \in \mathbb{Z}/2^k\mathbb{Z}$.

Proof. We observe that
\[ \widehat{f}_k(r) = 2^{-k} (1 - e(t))(1 - e(2t)) \cdots (1 - e(2^{k-1} t)), \tag{2.9} \]
where $t = -r/2^k$. The result now follows by observing that
\[ |1 - e(u)||1 - e(2u)| = 4|\sin(\pi u) \sin(2\pi u)| = 8|\cos(\pi u)|(1 - \cos^2(\pi u)) \]
is maximised when $\cos^2(\pi u) = 1/3$, and attains the value $16/3\sqrt{3}$ there. Thus, grouping terms in pairs, we see that
\[ |\widehat{f}_k(r)| \ll (4/3\sqrt{3})^{k/2} < 2^{-k/10}. \]
This concludes the proof. □
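
The product formula (2.9) and the resulting sup bound can be checked directly for small $k$. The sketch below is illustrative only; the helper names (`f_k`, `fourier_direct`, `fourier_product`, `sup_norm`) are hypothetical, and the transform is normalised as an average over $\mathbb{Z}/2^k\mathbb{Z}$, as in Definition 2.5.1.

```python
import cmath

def e(x):
    """The standard additive character e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * cmath.pi * x)

def f_k(n, k):
    """(-1)^{s_k(n)}: parity of the lowest k binary digits of n."""
    return -1 if bin(n % (1 << k)).count("1") % 2 else 1

def fourier_direct(k, r):
    """Average of f_k(x) e(-r x / 2^k) over x in Z/2^k Z."""
    size = 1 << k
    return sum(f_k(x, k) * e(-r * x / size) for x in range(size)) / size

def fourier_product(k, r):
    """The same coefficient computed from the product formula (2.9)."""
    t = -r / (1 << k)
    value = 1.0 + 0j
    for j in range(k):
        value *= (1 - e((1 << j) * t)) / 2
    return value

def sup_norm(k):
    """Largest Fourier coefficient of f_k in absolute value."""
    return max(abs(fourier_product(k, r)) for r in range(1 << k))
```

Pairing consecutive factors as in the proof gives the rigorous bound `sup_norm(k) <= (4 / (3 * 3 ** 0.5)) ** (k // 2)`, which the code can be used to confirm for small `k`.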

Proposition 2.5.3 ($L^1$ bound in progressions). Suppose that $0 \le k' \le k$ and that $a$ is a residue class (mod $2^{k'}$). Then
\[ \sum_{r \equiv a \,(\mathrm{mod}\, 2^{k'})} |\widehat{f}_k(r)| \ll 2^{(\frac{1}{2} - c)(k - k')} |\widehat{f}_{k'}(a)|. \]

Remark. For the purposes of discussion suppose that $k' = a = 0$. Then this proposition states that
\[ \sum_{r \in \mathbb{Z}/2^k\mathbb{Z}} |\widehat{f}_k(r)| \ll 2^{(\frac{1}{2} - c)k}. \]
By contrast the trivial bound (arising from Parseval's identity and an application of Cauchy-Schwarz) for this quantity would be $2^{k/2}$.

One tends to imagine that a saving over a trivial bound can be expected whenever there is some kind of "cancellation", such as one might expect if the function $f_k : \mathbb{Z}/2^k\mathbb{Z} \to \{-1, 1\}$ were chosen at random. This setting is rather different, and this represents perhaps the most remarkable feature of the paper [36]. It is possible to show that if $f_k$ is chosen randomly then
\[ \sum_{r \in \mathbb{Z}/2^k\mathbb{Z}} |\widehat{f}_k(r)| \gg 2^{k/2}. \]
Our proposition therefore captures a very specific property of the functions $f_k(n) = (-1)^{s_k(n)}$.
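
The saving described in this remark is visible numerically. The following sketch, under the assumption that $\widehat{f}_k$ is normalised as an average, computes the $L^1$ norm of $\widehat{f}_k$ from the product formula; the names `fourier_abs` and `l1_norm` are hypothetical.

```python
import cmath

def fourier_abs(k, r):
    """Absolute value of the r-th Fourier coefficient of f_k, via the product formula."""
    value = 1.0
    for j in range(k):
        # |1 - e(2^j r / 2^k)| / 2; the sign of the frequency does not affect the modulus
        value *= abs(1 - cmath.exp(2j * cmath.pi * (r << j) / (1 << k))) / 2
    return value

def l1_norm(k):
    """Sum over r in Z/2^k Z of the absolute values of the Fourier coefficients."""
    return sum(fourier_abs(k, r) for r in range(1 << k))
```

Tabulating `l1_norm(k) / 2 ** (k / 2)` for increasing `k` shows the ratio to the trivial bound decaying, which is the content of the proposition in the case $k' = a = 0$.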

Proof. Write $S(a, k)$ for the sum on the left. For any $r \in \mathbb{Z}/2^k\mathbb{Z}$ the expansion (2.9) yields
\[ |\widehat{f}_k(r)| = \tfrac{1}{2} |1 - e(r/2^k)| |\widehat{f}_{k-1}(r)|. \]
Suppose that $k \ge k' + 1$. Since $\widehat{f}_{k-1}(r)$ is periodic with period $2^{k-1}$, we may split $S(a, k)$ as a sum over two ranges $0 \le r < 2^{k-1}$ and $2^{k-1} \le r < 2^k$, thereby obtaining
\[ S(a, k) \le S(a, k-1) \sup_{t \in [0,1]} \tfrac{1}{2} \big( |1 + e(t)| + |1 - e(t)| \big). \]
Now a simple exercise in Euclidean geometry confirms that
\[ |1 + e(t)| + |1 - e(t)| \le 2\sqrt{2}, \tag{2.10} \]
with equality if and only if $t = \pm 1/4$. Hence
\[ S(a, k) \le \sqrt{2}\, S(a, k-1). \tag{2.11} \]
This does not, in itself, suffice to establish a nontrivial bound for $S(a, k)$.

If $k \ge k' + 2$ one may improve things by splitting the sum $S(a, k)$ into four ranges $j 2^{k-2} \le r < (j+1) 2^{k-2}$ ($j = 0, 1, 2, 3$). This leads to
\[ S(a, k) \le S(a, k-2) \sup_{t} \tfrac{1}{4} \sum_{j=0}^{3} |1 - e(t + j/4)||1 - e(2(t + j/4))|. \tag{2.12} \]
Now two applications of (2.10) confirm that
\[ \sum_{j=0}^{3} |1 - e(t + j/4)||1 - e(2(t + j/4))| \]
\[ = |1 - e(2t)| \big( |1 - e(t)| + |1 + e(t)| \big) + |1 + e(2t)| \big( |1 - e(t + 1/4)| + |1 + e(t + 1/4)| \big) \]
\[ \le 2\sqrt{2} \big( |1 - e(2t)| + |1 + e(2t)| \big) \le 8. \]
Furthermore equality can never occur, and so by compactness this expression is bounded by $8 - c$ for some $c > 0$. Comparing with (2.12) we see that
\[ S(a, k) \le (2 - c) S(a, k-2). \]
To complete the proof of the proposition one may apply this repeatedly, followed perhaps by one application of (2.11), to bound $S(a, k)$ above in terms of $S(a, k')$. □



2.6. Analysis of Type I sums. 

Proposition 2.6.1 (Estimate for Type I sums). Suppose that $2^\mu \le X^{1/100}$. For each
$m \sim 2^\mu$, suppose that $I_m \subseteq [2^{\nu-1}, 2^\nu]$ is an interval. Then



Remark. In [32] an alternative argument (related to the large sieve) is used, in which 
the same estimate is established under the weaker assumption that 2^ = Oe(X^/^~^). 
Proof. For $m, n$ in the ranges under consideration we have $f(mn) = f_k(mn)$, where $k := \mu + \nu$. Fix a value of $m$, and write $S_m := \sum_{n \in I_m} f(mn)$. We thus have
\[ S_m = \sum_{x \in \mathbb{Z}/2^k\mathbb{Z}} f_k(x) 1_P(x), \]
where $P = P_m := \{mn : n \in I_m\}$. By Parseval's identity this is

2'^ Y /.(r)lp(r), 

which is bounded by 2'^'||/^.||oo||lp||i- By summing a geometric series we have 

|lp(r)| <min(l, \\r /2''\\~^), 
and therefore $\|\widehat{1}_P\|_1 \ll k \ll \log X$. It follows from Proposition 2.5.2 that

5™<2'=X-^/i°logX. 

Summing over m ~ 2^, we obtain the required bound. □ 



2.7. Two diophantine lemmas. In this section we give two results of a 'diophantine' nature which we will use in the next section, in which Type II sums are analysed. Both of these are more-or-less standard in this subject.

Lemma 2.7.1 (Vinogradov). Suppose that $q, Q, R$ are all natural numbers less than $X$, that $\beta \in \mathbb{R}$, and suppose that $(a, q) = 1$. Then
\[ \sum_{x=0}^{R} \min\big( Q, \|ax/q + \beta\|^{-1} \big) \lessapprox Q + q + R + QR/q. \]

Proof. The key observation is that as $x$ ranges over any interval of length $q$, the numbers $ax/q$ (mod 1) range over the set $0, 1/q, \dots, (q-1)/q$. It follows easily that if $I$ is any interval of length at most $q$ then
\[ \sum_{x \in I} \min\big( Q, \|ax/q + \beta\|^{-1} \big) \ll Q + q \log q. \]
Since the range $x = 1, \dots, R$ may be split into at most $R/q + 1$ such ranges, the result follows immediately. □

Lemma 2.7.2 (Equidistribution lemma). Suppose that $\alpha \in \mathbb{R}$ and that $I$ is an interval of integers with $|I| = N$. Suppose that $\delta_1, \delta_2$ satisfy $\delta_1 < \delta_2/16$, and suppose that there are at least $\delta_2 N$ elements $n \in I$ for which $\|\alpha n\|_{\mathbb{R}/\mathbb{Z}} \le \delta_1$. Suppose that $N \ge 8/\delta_2$. Then there is some $q \le 8/\delta_2$ such that $\|\alpha q\|_{\mathbb{R}/\mathbb{Z}} \le 4 \delta_1 \delta_2^{-1} N^{-1}$.

Proof. By a well-known argument of Dirichlet there is some $q \le N$ and an $a$ coprime to $q$ such that $|\alpha - a/q| \le 1/qN$. Set $\theta := \alpha - a/q$, let $n_0 \in I$ be fixed, and set $\beta := n_0 \theta$. Then for any $n \in I$ we have, by the triangle inequality,
\[ \|\alpha n\|_{\mathbb{R}/\mathbb{Z}} \ge \|an/q + \beta\|_{\mathbb{R}/\mathbb{Z}} - \|\theta(n - n_0)\|_{\mathbb{R}/\mathbb{Z}} \ge \|an/q + \beta\|_{\mathbb{R}/\mathbb{Z}} - 1/q, \]
and so
\[ \|an/q + \beta\|_{\mathbb{R}/\mathbb{Z}} \le \delta_1 + \frac{1}{q} \tag{2.13} \]
for at least $\delta_2 N$ values of $n \in I$.

Now as $n$ ranges over any interval of length $q$ the numbers $an/q$ range over the set $\{0, 1/q, 2/q, \dots\}$. Thus in any such interval the number of $n$ satisfying (2.13) is no more than $\delta_1 q + 2$. Now $I$ may be divided into at most $N/q + 1$ intervals of length no more than $q$, and thus the number of $n \in I$ satisfying (2.13) is bounded by
\[ \Big( \frac{N}{q} + 1 \Big) (\delta_1 q + 2). \]
On the other hand, we are assuming this quantity is at least $\delta_2 N$. Since $\delta_2 \ge 4\delta_1$, $N \ge 8/\delta_2$ and $q \le N$ this easily implies that $q \le 8/\delta_2$.

Now the assumption of the lemma implies, by pigeonholing, that $\|\alpha n\|_{\mathbb{R}/\mathbb{Z}} \le 2\delta_1$ for at least $\delta_2 N/2$ values of $n \in \{1, \dots, N/2\}$. We have
\[ \alpha n = \frac{an}{q} + \theta n \]
where $|\theta| \le 1/qN$. If $n \le N/2$, it follows that either
\[ \|\alpha n\|_{\mathbb{R}/\mathbb{Z}} \ge 1/2q \ge \delta_2/16 > \delta_1 \]
or else $n$ is a multiple of $q$. But for $k \le N/2q$ we have simply
\[ \|\alpha k q\|_{\mathbb{R}/\mathbb{Z}} = |\theta k q|. \]
Thus $|\theta k q| \le 2\delta_1$ for at least $\delta_2 N/2$ such values of $k$, and in particular for some $k_0 \ge \delta_2 N/2$. Hence $|\theta| \le 2\delta_1/q k_0 \le 4\delta_1/q \delta_2 N$, which implies the result. □
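
The two ingredients of this proof, Dirichlet's approximation and the search for a small denominator $q$, can be illustrated with exact rational arithmetic. This is only a sketch under simplifying assumptions: $\alpha$ is taken rational, the interval is $I = \{1, \dots, N\}$, $\delta_2$ is taken to be the observed density, and the function names are hypothetical.

```python
from fractions import Fraction
from math import floor, gcd

def circle_norm(x):
    """Distance from the rational number x to the nearest integer."""
    fractional = x - floor(x)
    return min(fractional, 1 - fractional)

def dirichlet_approximation(alpha, N):
    """Return coprime (a, q) with 1 <= q <= N and |alpha - a/q| <= 1/(qN)."""
    # Pigeonhole guarantees some q <= N with ||q alpha|| <= 1/(N+1); take the best q.
    q = min(range(1, N + 1), key=lambda d: circle_norm(d * alpha))
    a = round(q * alpha)
    g = gcd(a, q)
    return a // g, q // g

def small_denominator(alpha, N, delta1):
    """Search for the q promised by Lemma 2.7.2 on the interval {1, ..., N}.

    Returns None when the hypotheses of the lemma are not satisfied.
    """
    count = sum(1 for n in range(1, N + 1) if circle_norm(n * alpha) <= delta1)
    if count == 0:
        return None
    delta2 = Fraction(count, N)
    if not (delta1 < delta2 / 16 and N >= 8 / delta2):
        return None
    target = 4 * delta1 / (delta2 * N)
    for q in range(1, floor(8 / delta2) + 1):
        if circle_norm(q * alpha) <= target:
            return q
    return None
```

For instance, with `alpha = Fraction(1, 3) + Fraction(1, 10**6)`, `N = 1000` and `delta1 = Fraction(1, 200)`, the multiples of 3 are exactly the $n$ with $\|\alpha n\|$ small, and the search recovers the denominator 3.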

2.8. Analysis of Type II sums. We start with an inequality which is often used in the estimation of Type II sums, van der Corput's inequality.

Van der Corput's inequality. Van der Corput's inequality is a kind of generalisation of the Cauchy-Schwarz inequality.

Lemma 2.8.1 (van der Corput inequality). Let $N, H$ be positive integers and suppose that $(a_n)_{n \in \{1, \dots, N\}}$ is a sequence of complex numbers. Extend $(a_n)$ to all of $\mathbb{Z}$ by defining $a_n := 0$ when $n \notin \{1, \dots, N\}$. Then
\[ \Big| \sum_n a_n \Big|^2 \le \frac{N + H}{H} \sum_{|h| \le H} \Big( 1 - \frac{|h|}{H} \Big) \sum_n a_n \overline{a_{n+h}}. \]

Proof. We have
\[ \sum_n a_n = \frac{1}{H} \sum_{-H < n \le N} \sum_{h=0}^{H-1} a_{n+h}. \]
Thus, applying the Cauchy-Schwarz inequality, we have
\[ \Big| \sum_n a_n \Big|^2 \le \frac{N + H}{H^2} \sum_{-H < n \le N} \Big| \sum_{h=0}^{H-1} a_{n+h} \Big|^2 = \frac{N + H}{H^2} \sum_{-H < n \le N} \sum_{h=0}^{H-1} \sum_{h'=0}^{H-1} a_{n+h} \overline{a_{n+h'}}, \]
which equals the right hand side of the claimed inequality. This concludes the proof. □
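
As a sanity check, both sides of the van der Corput inequality can be evaluated for random sequences. The sketch below assumes the inequality in the weighted form $\frac{N+H}{H} \sum_{|h| \le H} (1 - |h|/H) \sum_n a_n \overline{a_{n+h}}$; the function name is hypothetical.

```python
import random

def van_der_corput_sides(a, H):
    """Return (lhs, rhs) of the van der Corput inequality for a = (a_1, ..., a_N)."""
    N = len(a)

    def term(n):
        # a_n, extended by zero outside {1, ..., N}
        return a[n - 1] if 1 <= n <= N else 0j

    lhs = abs(sum(a)) ** 2
    rhs = 0j
    for h in range(-H, H + 1):
        correlation = sum(term(n) * term(n + h).conjugate() for n in range(1, N + 1))
        rhs += (1 - abs(h) / H) * correlation
    # The terms for h and -h are complex conjugates, so the total is real.
    return lhs, ((N + H) / H * rhs).real
```

With `H = 1` only the `h = 0` term survives and the inequality reduces to a (slightly lossy) form of Cauchy-Schwarz, consistent with the description of the lemma as a generalisation of that inequality.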

Proposition 2.8.2 (Estimate for Type II sums). Suppose that $X^{1/100} \le 2^\mu, 2^\nu \le X^{99/100}$. Suppose that $(a_m)_{m \sim 2^\mu}$ and $(b_n)_{n \sim 2^\nu}$ are arbitrary sequences of complex numbers
with $|a_m|, |b_n| \le 1$. Then



'l — K 

for some absolute $\kappa > 0$.



Proof. We may assume without loss of generality that $\nu \ge \mu$. Suppose for a contradiction that the result is false. We write this
\[ \Big| \sum_{m \sim 2^\mu} \sum_{n \sim 2^\nu} a_m b_n f(mn) \Big| \gtrapprox 2^{\mu+\nu}. \]
By Cauchy-Schwarz we obtain
\[ \sum_{m \sim 2^\mu} \Big| \sum_{n \sim 2^\nu} b_n f(mn) \Big|^2 \gtrapprox 2^{\mu + 2\nu}. \]

Let $\rho := c(\mu + \nu)$, where $c > 0$ is a constant to be chosen later, and apply the van der Corput inequality (Lemma 2.8.1) with $H := 2^\rho$. It follows that

m~2f n~2'' \h\^H 

The $h = 0$ term is negligible, and so we obtain
\[ \sum_{1 \le |h| < 2^\rho} \sum_{n \sim 2^\nu} \Big| \sum_{m \sim 2^\mu} f(m(n+h)) f(mn) \Big| \gtrapprox 2^{\mu + \nu + \rho}. \tag{2.14} \]
Our aim is to obtain some cancellation in the inner sum and thereby reach a contradiction. The first step is to show that $f$ can be replaced by the truncated function $f_k$, where $k := \mu + 2\rho$ (say) is not much bigger than $\mu$.

Now we have
\[ s(m(n+h)) - s_k(m(n+h)) = s(mn) - s_k(mn) \]
if $mn, m(n+h)$ lie in the same interval $[2^k t, 2^k(t+1)]$. Since $mh$ is much smaller than $2^k$, this will happen most of the time. In fact for it not to hold we must have $mn \in [2^k t - 2^{\mu+\rho}, 2^k t]$ for some $t$ (that is, there is a "carry" on adding $mh$ to $mn$). Now for $x \le X$ the divisor function $\tau$ satisfies $\tau(x) \lessapprox 1$, and therefore the number of "bad" pairs $(m, n)$ is at most
\[ \sum_t \sum_{l \in [2^k t - 2^{\mu+\rho},\, 2^k t]} \tau(l) \lessapprox 2^{\mu + \nu - \rho}. \]
The contribution of the bad pairs to (2.14) is therefore negligible, and we may replace that inequality by
\[ \sum_{1 \le |h| < 2^\rho} \sum_{n \sim 2^\nu} \Big| \sum_{m \sim 2^\mu} f_k(m(n+h)) f_k(mn) \Big| \gtrapprox 2^{\mu + \nu + \rho}. \tag{2.15} \]

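Let us now focus on the inner sum
\[ E_{n,h} := \sum_{m \sim 2^\mu} f_k(m(n+h)) f_k(mn), \]
which we will study using Fourier analysis on $\mathbb{Z}/2^k\mathbb{Z}$. Expanding both copies of $f_k$ using the inversion formula, we see that
\[ E_{n,h} = \sum_{m \sim 2^\mu} \sum_{r, s \in \mathbb{Z}/2^k\mathbb{Z}} \widehat{f}_k(r) \widehat{f}_k(s) e\Big( \frac{m(r(n+h) + sn)}{2^k} \Big), \]
which is bounded by
\[ \sum_{r, s \in \mathbb{Z}/2^k\mathbb{Z}} |\widehat{f}_k(r)||\widehat{f}_k(s)| \min\Big( 2^\mu, \Big\| \frac{r(n+h) + sn}{2^k} \Big\|^{-1} \Big). \]
From (2.15) we thus have
\[ \sum_{r, s \in \mathbb{Z}/2^k\mathbb{Z}} |\widehat{f}_k(r)||\widehat{f}_k(s)| \sum_{n \sim 2^\nu} \sum_{1 \le |h| \le 2^\rho} \min\Big( 2^\mu, \Big\| \frac{r(n+h) + sn}{2^k} \Big\|^{-1} \Big) \gtrapprox 2^{\mu + \nu + \rho}. \tag{2.16} \]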
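
Define the weights
\[ \omega_{r,s} := 2^{-\mu - \nu - \rho} \sum_{n \sim 2^\nu} \sum_{1 \le |h| \le 2^\rho} \min\Big( 2^\mu, \Big\| \frac{r(n+h) + sn}{2^k} \Big\|^{-1} \Big). \]
Then (2.16) becomes
\[ \sum_{r, s \in \mathbb{Z}/2^k\mathbb{Z}} |\widehat{f}_k(r)||\widehat{f}_k(s)| \omega_{r,s} \gtrapprox 1. \tag{2.17} \]
To do anything with this we need to gain a greater understanding of the weights $\omega_{r,s}$.

Lemma 2.8.3 (Size of $\omega_{r,s}$). Suppose that $v_2(r + s) = t$. Then
\[ \omega_{r,s} \lessapprox 2^{-\mu} + 2^{k - t - \mu - \nu} + 2^{t - k}. \]

Proof. Write $(r + s)/2^k = a/2^{k-t}$, where $a$ is odd. Thus
\[ \frac{r(n+h) + sn}{2^k} = \frac{a}{2^{k-t}} n + \frac{r}{2^k} h. \]
Thus
\[ \omega_{r,s} \ll 2^{-\mu - \nu} \sup_h \sum_{n \sim 2^\nu} \min\Big( 2^\mu, \Big\| \frac{a}{2^{k-t}} n + \frac{r}{2^k} h \Big\|^{-1} \Big). \]
By Lemma 2.7.1 we obtain
\[ \omega_{r,s} \lessapprox 2^{-\nu} + 2^{-\mu} + 2^{k - t - \mu - \nu} + 2^{t - k}. \]
Recalling that $\nu \ge \mu$, we see that this implies the claimed bound. □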

Let us recall (2.17). This clearly implies that there is $t$, $0 \le t \le k$, such that
\[ \sum_{r, s \in \mathbb{Z}/2^k\mathbb{Z},\ v_2(r+s) = t} |\widehat{f}_k(r)||\widehat{f}_k(s)| \omega_{r,s} \gtrapprox 1. \tag{2.18} \]

Now by Lemma 2.8.3, Proposition 2.5.3 and Parseval's inequality (in turn) we have
J2 \fk{r)ms)\u^r,s 

r,seZ/2'=Z 

aGZ/2'Zr=a(mod2*) s=-a(mod2«) 

<2(fc-*){i-c)(2-M + 2^-*-'^-^ + 2*-^) Y \fM)\\k-a)\ 

aGZ/2*Z 

for some absolute constant c > 0. Recalling that k = fi + 2p we see that the first two 
terms are at most 2^''"^'^, which is not ^ 1 if p is chosen sufficiently small. Thus we see 
that if fl^JHD holds then 2-^('=-*) > 1, which implies that 2* > 2^. 

Assume, then, that this is the case. Applying Parseval's identity again (or Proposition 2.5.3, but this is overkill now) we obtain, for any $\epsilon$,



r,seZ/2*Z a6Z/2*Zr=a(mod2*) s=-a(mod2*) 

U2{r+s)=t 



^ 2^k—t—ek <^ 2 



ek 



It follows that, in (2.18), we may restrict attention to values of $r, s$ for which $\omega_{r,s} \gtrapprox 1$,
thus



J2 |A(r)||/,(.)k,.> 1. (2.19) 

r,sGZ/2'=Z 

In the next lemma we will show that there are rather few pairs $(r, s)$ with $\omega_{r,s} \gtrapprox 1$. To do this we will make use of the as yet unexploited averaging over $h$ which occurs in the definition of $\omega_{r,s}$.

Lemma 2.8.4 (Large values of uJr,s)- There are ^ 2^ pairs (r, s) G Z/2'^Z x Z/2'^Z such 
that uJr,s ~ 1- 

Proof. We know already, by Lemma 12.8.31 that we must have i'2{r + s) = t, where 
2* ^ 2''. Write (r + s)/2^ = a/2^~\ where a is odd. We have 

E E--(2Mi^- + ^^ir^)^2-^^^''- 

Dividing the sum over n into residue classes (mod 2^^^*), of which there are ^ 1, we see 
that there is 6 such that 

^ min(2M|^ + ^/^||-i)>2'^+''. (2.20) 

Although this looks to be of the form covered by Vinogradov's lemma (Lemma I2.7.ip . 
the fact that 2'' ^ 2'^ means that more information may be gleaned by appealing 
instead to Lemma (2321 From (CTIj) it follows that for > 2p values of h, 1 ^ \h\ < 2^, 
we have 

\\9 + ^h\\ <2-^ 

Fixing some ho with this property and considering the numbers h — ho, we may assume 
that 6 = 0. Applying Lemma 12.7.21 we see that 

||gr/2iK/z^2^^-'' 

for some ~ 1- Thus if oOr^s ~ 1 then there is some 5' ~ 1 such that |rg(mod 2*^)1 ^ 2^ 
(recall that k = fi + 2p). There are ^ 2^ such values of r, and for each of them there 
are just ^ 1 values of s such that t := V2{f + s) satisfies 2* ^ 2^^. □ 

It is now an easy matter to show that (2.19) cannot hold which, as we have shown, implies that Type II sums are small. Indeed using Lemma 2.8.4 and Proposition 2.5.2 we obtain



J2 \h{r)ms)\u;r,s^2^-''''. 

r,sgZ/2''Z 

This contradicts (2.19) if $\rho$ is chosen sufficiently small. □

Proof of Theorem 2.1.1. Almost nothing more need be said: Theorem 2.1.1 is an immediate consequence of Propositions 2.3.1, 2.6.1 and 2.8.2. □



3. Patterns of primes 

The aim of this section is to discuss work of the author and T. Tao on linear equations 
in primes. This programme is not yet completed and what has been done so far is 
spread across a number of papers [211 ESI ESI Ell ESI ES] . It is also discussed in several 
expository articles ^1?! [T9lli2lli3l Hi] . 

Our aim here is to do little more than try to describe what we have proved and what we hope to prove, and to furnish the reader with some idea of how the various papers fit together. In other words we hope that this section may be used as a sort of 'roadmap' for readers interested in a more detailed study of the papers. There are some conspicuous absences from the current section. Our treatment of the history of the subject is very patchy (see the article [34] for this), and we omit any discussion of links with the ergodic theory literature (see [33] for this). References are to the versions of the papers which were available on www.arxiv.org on October 1st 2007. It is quite likely that published versions will have different theorem numberings.

Suppose that $\psi_1, \dots, \psi_t : \mathbb{Z}^d \to \mathbb{Z}$ are affine-linear forms. The basic questions which motivate our work are the following.

Question 3.0.5 (Existence of prime values). Are there values of $n \in \mathbb{Z}^d$ for which these forms all take prime values? Are there infinitely many such $n$?

Question 3.0.6 (Asymptotics). How many such $n$ are there inside the box $[-N, N]^d$?

In this generality, our questions contain many of the classical questions in additive prime number theory.

(i) When $d = 1$, $t = 2$, $\psi_1(n) = n$ and $\psi_2(n) = 2m - n$ we have the Goldbach Conjecture: is $2m$ the sum of two primes?

(ii) When $d = 1$, $t = 2$, $\psi_1(n) = n$ and $\psi_2(n) = n + 2$ we have the Twin Prime Conjecture: are there infinitely many pairs of primes which differ by 2?

(iii) When $d = 2$, $t = 3$, $\psi_1(n) = n_1$, $\psi_2(n) = n_2$ and $\psi_3(n) = m - n_1 - n_2$ ($m$ odd) we have the Ternary Goldbach Conjecture: is $m$ the sum of three primes?

(iv) When $d = 2$, $t = k$ and $\psi_i(n) = n_1 + (i - 1) n_2$, $i = 1, \dots, k$, we have the question of whether there exist arithmetic progressions of length $k$ consisting entirely of primes.
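
The four examples can be explored by brute force for small parameters. The sketch below is purely illustrative: a system is represented (by assumption) as a list of pairs `(coefficients, constant)`, and `prime_points` is a hypothetical helper that lists the points of a box at which every form is prime.

```python
from itertools import product

def is_prime(x):
    """Trial division; adequate for the small values used here."""
    if x < 2:
        return False
    d = 2
    while d * d <= x:
        if x % d == 0:
            return False
        d += 1
    return True

def prime_points(forms, lo, hi):
    """List all n in [lo, hi]^d at which every affine-linear form is prime.

    Each form is a pair (coefficients, constant) representing
    psi(n) = coefficients[0]*n_1 + ... + coefficients[d-1]*n_d + constant.
    """
    d = len(forms[0][0])
    hits = []
    for n in product(range(lo, hi + 1), repeat=d):
        values = [sum(c * x for c, x in zip(coeffs, n)) + const
                  for coeffs, const in forms]
        if all(is_prime(v) for v in values):
            hits.append(n)
    return hits

# Example (ii): twin primes n, n + 2 with 0 <= n <= 100.
twin_system = [((1,), 0), ((1,), 2)]
# Example (i) with 2m = 100: ordered representations of 100 as a sum of two primes.
goldbach_system = [((1,), 0), ((-1,), 100)]
# Example (iv) with k = 3: three-term progressions n_1, n_1 + n_2, n_1 + 2 n_2.
progression_system = [((1, 0), 0), ((1, 1), 0), ((1, 2), 0)]
```

For example, `prime_points(progression_system, 1, 20)` contains the pair `(3, 2)`, corresponding to the progression 3, 5, 7.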

Essentially nothing is known about either Question 3.0.5 or Question 3.0.6 in cases (i) and (ii). Both Questions were answered in case (iii) some seventy years ago by Vinogradov, building on earlier work of Hardy and Littlewood. Question 3.0.5 was answered in case (iv) by the author and Tao [21]. Question 3.0.6 in that case is much harder and was answered for $k = 3$ by Chowla and van der Corput in 1939 and for $k = 4$ by the author and Tao [221 ESI El] . Question 3.0.6 for $k \ge 5$ is one of the main goals of our current programme of research, and we shall report on what progress has been made so far. We will not give any further separate discussion of Question 3.0.5 as this has now been exposited in many places. Particularly recommended are the articles
[2H] and [Sg. See also [IHl 1121 SSI ilj •

There are very natural conjectural answers to Questions 3.0.5 and 3.0.6. It is clear that congruence conditions may result in there being no, or very few, choices of $n$ for which all of the forms $\psi_i(n)$ are prime. A trivial example is the system $\psi_1(n) = n$, $\psi_2(n) = n + 7$: consideration of this (mod 2) obviously implies that $\psi_1(n)$ and $\psi_2(n)$ cannot both be prime. Congruence conditions may alter our expectations in more subtle ways too. For example if $n$ is known to be prime (and is not 2) then one feels that $n + 2$ is more likely to be prime than a random integer of the same size, for it is already known to be odd. Pulling against this, however, is the observation that if $n \ne 3$ then $n$ is congruent to either 1 or 2 (mod 3), and so $n + 2$ is congruent to either 0 or 1 (mod 3), but never to 2. On this (mod 3) evidence one feels that $n + 2$ is less likely to be prime than a random integer of the same size. Another obvious way in which one could fail to have any prime values among the $\psi_i(n)$ is if they cannot be simultaneously positive, for example $\psi_1(n) = n_1 - n_2$, $\psi_2(n) = n_2 - n_3$, $\psi_3(n) = n_3 - n_1$.

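A more profound analysis of these intuitions suggests the following conjecture. In the formulation of this conjecture we use the local von Mangoldt functions $\Lambda_{\mathbb{Z}/p\mathbb{Z}}$, defined by
\[ \Lambda_{\mathbb{Z}/p\mathbb{Z}}(x) := \begin{cases} p/(p-1) & \text{if } x \not\equiv 0 \pmod{p}, \\ 0 & \text{if } x \equiv 0 \pmod{p}. \end{cases} \]

Dickson's Conjecture. Suppose that $d, t, N \ge 1$ are integers. Suppose that no two of the forms $\psi_i$ are rational multiples of one another, and that no form $\psi_i$ is constant. Write $\psi_i(n) = l_{i1} n_1 + \dots + l_{id} n_d + b_i$, and suppose that we have $|l_{ij}| \le L$, $|b_i| \le LN$ for some real number $L$. Then for any convex body $K \subseteq [-N, N]^d$ we have
\[ \sum_{n \in K \cap \mathbb{Z}^d} \Lambda(\psi_1(n)) \cdots \Lambda(\psi_t(n)) = \beta_\infty \prod_p \beta_p + o_{d,t,L}(N^d), \]
where the local factors $\beta_p$ are defined by
\[ \beta_p := \mathbb{E}_{n \in (\mathbb{Z}/p\mathbb{Z})^d} \Lambda_{\mathbb{Z}/p\mathbb{Z}}(\psi_1(n)) \cdots \Lambda_{\mathbb{Z}/p\mathbb{Z}}(\psi_t(n)) \]
and
\[ \beta_\infty := \mathrm{vol}\big( K \cap \psi_1^{-1}(\mathbb{R}_{\ge 0}) \cap \dots \cap \psi_t^{-1}(\mathbb{R}_{\ge 0}) \big). \]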

Remarks. We have counted primes weighted using the von Mangoldt function, as this gives tidier expressions. For an unweighted version see [2H Conjecture 1.4]. It is quite fun to play with particular cases of the conjecture. For example with $d = 1$, $t = 2$, $\psi_1(n) = n$, $\psi_2(n) = n + 2$ and $K = [0, N]$ we obtain the conjecture
\[ \sum_{n \le N} \Lambda(n) \Lambda(n + 2) = 2 \prod_{p \ge 3} \Big( 1 - \frac{1}{(p-1)^2} \Big) N + o(N) \]
for the (weighted) number of twin primes less than or equal to $N$. The numerical value of the constant here is about 1.32. Some other examples are worked out in [2H §1].
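
The twin prime case of the conjecture is easy to experiment with. The following sketch truncates the Euler product at a chosen prime bound and compares it with the weighted count; the function names are hypothetical, and the comparison is of course only heuristic evidence, not a proof of anything.

```python
from math import log

def primes_up_to(limit):
    """Sieve of Eratosthenes returning the list of primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(limit + 1) if sieve[p]]

def twin_prime_constant(prime_limit):
    """Truncation of 2 * prod_{p >= 3} (1 - 1/(p-1)^2) to primes p <= prime_limit."""
    constant = 2.0
    for p in primes_up_to(prime_limit):
        if p >= 3:
            constant *= 1 - 1 / (p - 1) ** 2
    return constant

def weighted_twin_count(N):
    """Compute sum_{n <= N} Lambda(n) Lambda(n + 2), prime powers included."""
    lam = [0.0] * (N + 3)
    for p in primes_up_to(N + 2):
        q = p
        while q <= N + 2:
            lam[q] = log(p)
            q *= p
    return sum(lam[n] * lam[n + 2] for n in range(1, N + 1))
```

The truncated product stabilises quickly near 1.32, and `weighted_twin_count(N) / N` for moderately large `N` lands in the same neighbourhood, in line with the conjecture.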

Let us return to the specific examples (i) - (iv) mentioned above. What makes some of these questions much easier than others? The most important parameter in determining the difficulty of Questions 3.0.5 and 3.0.6 is the complexity of the system of forms $\{\psi_1, \dots, \psi_t\}$.

The definition of complexity involves nothing more than simple linear algebra, but it is not especially illuminating at first sight. Let $s$ be a positive integer. We say that the complexity of the system $\{\psi_1, \dots, \psi_t\}$ is at most $s$ if, for any $i \in \{1, \dots, t\}$, the forms $\{\psi_1, \dots, \psi_{i-1}, \psi_{i+1}, \dots, \psi_t\}$ may be divided into $s + 1$ classes in such a way that $\psi_i$ is not in the affine linear span of any of them. Thus the system $\{n_1, n_1 + n_2, n_1 + 2n_2, n_1 + 3n_2\}$ has complexity at most 2 since, quite obviously, we may remove any one form and divide the remaining three forms into singleton classes whose affine span does not contain that form. However the complexity of this system is not at most 1: if we remove the form $n_1$ then it is impossible to divide the remaining forms $\{n_1 + n_2, n_1 + 2n_2, n_1 + 3n_2\}$ into two classes such that $n_1$ is not in the affine linear span of any class. Thus the complexity of this system is exactly two. The complexity of the system $\{n_1, n_1 + n_2, n_1 + 2n_2\}$ is one, as is the complexity of the system $\{n_1, n_2, m - n_1 - n_2\}$ for any fixed $m$. The complexity of the system $\{n, n + 2\}$ is apparently undefined, since if we remove the form $n$ it is impossible to partition the singleton class $\{n + 2\}$ in any way such that $n$ is not contained in the affine span of a class. In such cases we say that the complexity is infinite. The system $\{n, 2m - n\}$, $m$ fixed, arising from the Goldbach Conjecture has infinite complexity.
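
Since the definition is pure linear algebra, the complexity of a small system can be found by exhaustive search. The sketch below makes two assumptions that should be stated plainly: forms have integer coefficients, and, because constants lie in every affine linear span, a form is represented only by its linear part. The function names are hypothetical, and the search is exponential in the number of forms.

```python
from itertools import product

def rank(vectors):
    """Rank over Q of integer vectors, by fraction-free Gaussian elimination."""
    rows = [list(v) for v in vectors]
    rank_so_far = 0
    width = len(rows[0]) if rows else 0
    for col in range(width):
        pivot = next((i for i in range(rank_so_far, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank_so_far], rows[pivot] = rows[pivot], rows[rank_so_far]
        lead = rows[rank_so_far]
        for i in range(rank_so_far + 1, len(rows)):
            if rows[i][col]:
                factor = rows[i][col]
                rows[i] = [lead[col] * x - factor * y for x, y in zip(rows[i], lead)]
        rank_so_far += 1
    return rank_so_far

def in_span(target, vectors):
    """Is the linear part `target` in the span of the linear parts `vectors`?"""
    return rank(list(vectors) + [target]) == rank(vectors)

def complexity_at_most(linear_parts, s):
    """Check the definition: every form can be separated using s + 1 classes."""
    for i, target in enumerate(linear_parts):
        others = linear_parts[:i] + linear_parts[i + 1:]
        found = False
        # By symmetry of the class labels, the first remaining form goes in class 0.
        for rest in product(range(s + 1), repeat=len(others) - 1):
            colours = (0,) + rest
            classes = [[v for v, c in zip(others, colours) if c == j]
                       for j in range(s + 1)]
            if not any(in_span(target, cls) for cls in classes):
                found = True
                break
        if not found:
            return False
    return True

def complexity(linear_parts):
    """Least s >= 1 with complexity at most s, or None if the complexity is infinite."""
    for s in range(1, max(2, len(linear_parts) - 1)):
        if complexity_at_most(linear_parts, s):
            return s
    return None
```

Here `complexity([(1, 0), (1, 1), (1, 2), (1, 3)])` treats the four-term progression and `complexity([(1,), (1,)])` the twin prime system $\{n, n+2\}$, for which `None` signals infinite complexity.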

Roughly speaking, only systems of complexity one were understood before the recent 
work [221 [231 [21] • Much of the theory in the complexity one case was worked out in a 
paper of Balog [3] , which built upon the techniques of Vinogradov. 

Complexity is the most important measure of how difficult it is to solve Questions 3.0.5 and 3.0.6. However there are some rather trivial examples of systems of complexity greater than one for which Question 3.0.5 can be answered. For example just using the fact that there are $\gg N/\log N$ primes less than $N$ and the Cauchy-Schwarz inequality it is possible to show that there are $\gg N^4/\log^8 N$ quadruples $(n_1, n_2, n_3, n_4)$ such that all eight of the forms
\[ \{n_1,\ n_1 + n_2,\ n_1 + n_3,\ n_1 + n_4,\ n_1 + n_2 + n_3,\ n_1 + n_2 + n_4,\ n_1 + n_3 + n_4,\ n_1 + n_2 + n_3 + n_4\} \]
are prime. This system has complexity 2.

The reason that complexity one systems have proved amenable to attack is that they can be studied using harmonic analysis, and in particular the circle method of Hardy and Littlewood. We are now going to discuss some ideas behind this method in a rather unorthodox way. It turns out that this is the easiest (in fact so far the only) description to generalize to systems of complexity 2 and higher. The following formulation of the principle that 'harmonic analysis governs systems of complexity 1' is established in [24].

Proposition 3.0.7. Let $N$ be a prime. Suppose that $\{\psi_1, \dots, \psi_t\}$ is a system of affine-linear forms of complexity 1. Suppose that $f_1, \dots, f_t : \mathbb{Z}/N\mathbb{Z} \to [-1, 1]$ are functions and that
\[ \big| \mathbb{E}_{n \in (\mathbb{Z}/N\mathbb{Z})^d} f_1(\psi_1(n)) \cdots f_t(\psi_t(n)) \big| \ge \delta. \tag{3.1} \]
Then for each $i \in [t]$ there is some $r \in \mathbb{Z}/N\mathbb{Z}$ such that
\[ \big| \mathbb{E}_{n \in \mathbb{Z}/N\mathbb{Z}} f_i(n) e(-rn/N) \big| \gg_\delta 1. \tag{3.2} \]

In words, if the functions $f_1, \dots, f_t$ behave in some way unexpectedly when evaluated along the linear forms $\psi_i$, this phenomenon can be detected by evaluating the Fourier coefficients of the $f_i$.

Proposition 3.0.7 is proved in two stages. The first step is to establish a generalized von Neumann theorem, which is a bound of the form
\[ \big| \mathbb{E}_{n \in (\mathbb{Z}/N\mathbb{Z})^d} f_1(\psi_1(n)) \cdots f_t(\psi_t(n)) \big| \le \inf_i \|f_i\|_{U^2}. \]
Here $\|f\|_{U^2}$ denotes the Gowers $U^2$-norm of $f$ and is defined by
\[ \|f\|_{U^2}^4 := \mathbb{E}_{x, h_1, h_2 \in \mathbb{Z}/N\mathbb{Z}} f(x) f(x + h_1) f(x + h_2) f(x + h_1 + h_2). \]

Results of this type are proved using nothing more than a few applications of the 
Cauchy-Schwarz inequality, although the notation can get quite complicated. A simple 
example is given in the proof of pj], Proposition 1.9]. Foundational material on the 
Gowers norms (including, for example, a proof that they are norms) may be found in 
HSl Chapter 11] or [211 Chapter 5]. 

This first step in the proof of Proposition 3.0.7 leads from the hypothesis (3.1) to the conclusion that each $\|f_i\|_{U^2}$ is at least $\delta$. To obtain the conclusion (3.2), then, it suffices to establish an Inverse Theorem for the Gowers $U^2$-norm, stating that if the $U^2$-norm of $f$ is large then $f$ correlates with a linear phase.

The proof of this result is so short we give it here. Suppose that $f : \mathbb{Z}/N\mathbb{Z} \to [-1, 1]$ is a function with $\|f\|_{U^2} \ge \delta$. Define the Fourier transform $\widehat{f} : \mathbb{Z}/N\mathbb{Z} \to \mathbb{C}$ by
\[ \widehat{f}(r) := \mathbb{E}_{n \in \mathbb{Z}/N\mathbb{Z}} f(n) e(-rn/N). \]
Using orthogonality relations one may easily check that
\[ \|\widehat{f}\|_4 := \Big( \sum_r |\widehat{f}(r)|^4 \Big)^{1/4} = \|f\|_{U^2}. \tag{3.3} \]
Thus
\[ \|\widehat{f}\|_\infty^2 \|\widehat{f}\|_2^2 \ge \|\widehat{f}\|_4^4 \ge \delta^4. \]
But by Parseval's identity we have
\[ \|\widehat{f}\|_2 = \|f\|_2 \le 1, \]
and so we conclude that
\[ \|\widehat{f}\|_\infty \ge \delta^2. \]
That is, there is some $r \in \mathbb{Z}/N\mathbb{Z}$ such that
\[ \big| \mathbb{E}_{n \in \mathbb{Z}/N\mathbb{Z}} f(n) e(-rn/N) \big| \ge \delta^2. \tag{3.4} \]
We note that (3.3) also gives a converse result: if there is some $r \in \mathbb{Z}/N\mathbb{Z}$ such that
\[ \big| \mathbb{E}_{n \in \mathbb{Z}/N\mathbb{Z}} f(n) e(-rn/N) \big| \ge \delta \]
then $\|f\|_{U^2} \ge \delta$.

Proposition 3.0.7 is a somewhat convincing way to formulate the idea that 'harmonic analysis can handle systems of complexity one'. For a variety of reasons, however, it is not immediately applicable to an understanding of the quantity
\[ \sum_{n \in K \cap \mathbb{Z}^d} \Lambda(\psi_1(n)) \cdots \Lambda(\psi_t(n)) \tag{3.5} \]
appearing in Dickson's Conjecture. One obvious point is that the average in Proposition 3.0.7 is over $(\mathbb{Z}/N\mathbb{Z})^d$, rather than over $[N]^d$ or over the lattice points inside a convex body. This is a purely technical distinction, and is dealt with in [2ll Appendix C].

A more serious problem arises from consideration of how Proposition 3.0.7 might be applied. There is certainly no mileage to be gained from applying it in the most naive way, that is to say with $f_1 = \dots = f_t = \Lambda$. Indeed in that case condition (3.2) does hold (with $r = 0$), and so we cannot rule out the possibility of (3.1) holding (and, of course, Dickson's Conjecture predicts that (3.1) does hold much of the time). To eliminate the possibility of (3.2) holding with $r = 0$ we might split $\Lambda = 1 + (\Lambda - 1)$, expand (3.5) as a sum of $2^t$ terms, and use Proposition 3.0.7 to show that all of the terms except the one with $f_1 = \dots = f_t = 1$ are 'negligible'. This also fails to work, for the simple reason that those other terms may not be negligible. Indeed if they were then the arithmetic constant in Dickson's Conjecture would be simply 1 rather than the product $\beta_\infty \prod_p \beta_p$ reflecting the distribution of primes (mod 2), (mod 3) etc.

In all of the work of the author and Tao on primes these issues are bypassed by means of the so-called W-trick. The trick is easiest to describe in the context of our paper [21] establishing the existence of arbitrarily long progressions of primes. Look at the primes 2, 3, 5, 7, .... They are very irregularly distributed (mod 2). However if one deletes the element 2 and rescales the remaining primes by the map $x \mapsto (x - 1)/2$ one ends up with the sequence 1, 2, 3, 5, 6, 8, .... This is now quite regularly distributed (mod 2), because there are roughly the same number of primes congruent to 1 (mod 4) as there are primes congruent to 3 (mod 4). Furthermore if one finds a long arithmetic progression in the new sequence it translates immediately to a long progression in the primes.

Unfortunately, however, this new sequence is not well-distributed (mod 3), as the only element congruent to 1 (mod 3) that it contains is 1. However we could pick out the elements divisible by 3, that is to say 3, 6, 9, 15, ... and divide through by 3 to obtain the new sequence 1, 2, 3, 5, .... This is now well-distributed modulo both 2 and 3.

The process may be continued with the primes up to some threshold $w(N)$. Choosing $w(N)$ to tend to infinity with $N$, the resulting sequences have no appreciable biases modulo small numbers.
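
The effect of this rescaling on residue classes can be seen in a short experiment. The sketch below counts, for a modulus and a range chosen purely for illustration, how the set $\{n : Wn + b \text{ is prime}\}$ is distributed among residue classes; the function names are hypothetical.

```python
from math import isqrt

def prime_table(limit):
    """Byte table with table[x] == 1 exactly when x is prime, for 0 <= x <= limit."""
    table = bytearray([1]) * (limit + 1)
    table[0:2] = b"\x00\x00"
    for p in range(2, isqrt(limit) + 1):
        if table[p]:
            table[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return table

def w_tricked_sequence(W, b, limit):
    """The integers 1 <= n <= limit for which W*n + b is prime."""
    table = prime_table(W * limit + b)
    return [n for n in range(1, limit + 1) if table[W * n + b]]

def residue_counts(W, b, modulus, limit):
    """Count the elements of the W-tricked sequence in each residue class."""
    counts = [0] * modulus
    for n in w_tricked_sequence(W, b, limit):
        counts[n % modulus] += 1
    return counts
```

With `W = 1, b = 0` the primes themselves are counted and the class 0 (mod 3) contains only the prime 3, whereas with `W = 6, b = 1` the three classes (mod 3) are nearly equally populated; `w_tricked_sequence(2, 1, 8)` reproduces the sequence 1, 2, 3, 5, 6, 8 from the text.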

It is in fact quite easy to formalise this idea and to see how it may be applied to quite general problems such as (3.5). The sequences that result from the sieving process we have just described are of the form
\[ \{n : Wn + b \text{ is prime}\} \]
where $W := 2 \times 3 \times \dots \times w(N)$ (hence the name 'W-trick') and $(b, W) = 1$. For consideration of the weighted sum appearing in (3.5) it is rather natural to introduce the functions
\[ \Lambda_{b,W}(n) := \frac{\phi(W)}{W} \Lambda(Wn + b), \]

which have average value roughly 1. Roughly speaking, the sum in (13.51) may be split 
into (l){Wy sums of the form 

'^Ab^^wii'iin)) . . .Ab^^w{^d{n)) (3.6) 

(for the details of this decomposition see [211 Chapter 5]). One might then attempt to 
evaluate each of these by splitting 

Ab,,w = 1 + (Afe„w/ - 1) 

and then apply Proposition 13.0.71 to show that all of the terms except that with /i = 
■ ■ ■ = ft = I are negligible. 
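
The claim that the W-tricked functions have average value roughly 1 can be checked numerically. What follows is a minimal sketch, assuming the normalisation $\Lambda_{b,W}(n) = \frac{\phi(W)}{W} \Lambda(Wn + b)$; the choices W = 30 and N = 20000 and the function names are ours and purely illustrative (in the actual argument $w(N)$ tends to infinity with $N$).

```python
from math import gcd, log


def von_mangoldt_table(limit):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_composite = bytearray(limit + 1)
    for p in range(2, limit + 1):
        if not is_composite[p]:
            for m in range(p * p, limit + 1, p):
                is_composite[m] = 1
            # Lambda(p^k) = log p for every prime power p^k.
            pk = p
            while pk <= limit:
                lam[pk] = log(p)
                pk *= p
    return lam


W = 2 * 3 * 5        # product of the primes up to w = 5 (illustrative choice)
PHI_W = 1 * 2 * 4    # Euler's phi(30) = 8
N = 20_000           # illustrative range for n

lam = von_mangoldt_table(W * N + W)


def average_lambda_bW(b):
    """Average over 1 <= n <= N of (phi(W)/W) * Lambda(W n + b)."""
    return sum(PHI_W / W * lam[W * n + b] for n in range(1, N + 1)) / N


averages = {b: average_lambda_bW(b) for b in range(1, W) if gcd(b, W) == 1}
for b, avg in averages.items():
    print(f"b = {b:2d}: average of Lambda_(b,W) is {avg:.4f}")
```

Each of the $\phi(W) = 8$ admissible residues $b$ should give an average close to 1, which is the prime number theorem in arithmetic progressions to modulus $W$ in disguise.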

Such an approach is promising, but there is one serious additional problem. Proposition 
3.0.7 only applied, as we stated it, to functions $f_i$ which are bounded by 1. The functions 
$\Lambda_{b_i,W} - 1$ are certainly not bounded by one. Indeed, the harmonic analysis argument 
leading up to (3.4) relied on this boundedness in a rather essential way. 

It turns out that there is a version of Proposition 3.0.7 which applies to functions which 
are not necessarily bounded by 1; this is one of the main results of [24], and the key 
idea was also an important component of our earlier work [21]. 

For simplicity let us think about the von Mangoldt function $\Lambda$ itself, rather than the 
'W-tricked' variants $\Lambda_{b,W}$. Let $R = N^\gamma$, where $\gamma \in (0, 1)$ is a real number, and let us 
recall the discussion of §1 where we observed that 

$$\Big( \sum_{d | n} \lambda_d^{GY} \Big)^2$$ 

is a sensible majorant for the characteristic function of the primes between $R$ and $N$, 
where 

$$\lambda_d^{GY} := \mu(d) \frac{\log(R/d)}{\log R}.$$ 

It follows that the function 

$$\nu(n) := \log R \Big( \sum_{d | n, \, d \le R} \lambda_d^{GY} \Big)^2$$ 

essentially majorizes some multiple $c_\gamma \Lambda(n)$ of the von Mangoldt function on the interval 
[N] (where by essentially we mean that there may be problems when $n \le R$ or $n$ is a 
prime power, but these are highly unimportant exceptions). We note that the link we 
have just made to the work of Goldston, Pintz and Yildirim is by no means artificial; 
indeed a crucial moment in the development of [21] occurred when Andrew Granville 
drew our attention to [9]. 
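
The majorant property at primes is simple enough to test directly. The sketch below is illustrative only, and rests on an assumption: it takes the weight to be the truncated divisor sum $\nu(n) = \log R \, (\sum_{d | n, d \le R} \mu(d) \log(R/d) / \log R)^2$, a normalisation chosen here for the sake of the example. The values N = 10000 and R = 100 (so $\gamma = 1/2$) and all function names are likewise our own.

```python
from math import log


def mobius(n):
    """Moebius function by trial division (adequate for small n)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n is divisible by p^2
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def is_prime(n):
    return n >= 2 and all(n % p for p in range(2, int(n ** 0.5) + 1))


def nu(n, R):
    """Assumed normalisation: log R * (sum_{d | n, d <= R} mu(d) log(R/d) / log R)^2."""
    s = sum(mobius(d) * log(R / d) for d in range(1, R + 1) if n % d == 0) / log(R)
    return log(R) * s * s


N = 10_000
R = 100                   # R = N^gamma with gamma = 1/2 (illustrative)
GAMMA = log(R) / log(N)

# For a prime p with R < p <= N only d = 1 contributes, so nu(p) = log R,
# which is at least gamma * log p = gamma * Lambda(p).
majorised = all(
    nu(p, R) >= GAMMA * log(p) - 1e-9
    for p in range(R + 1, N + 1)
    if is_prime(p)
)
print("nu majorises gamma * Lambda at primes in (R, N]:", majorised)
```

Note that the weight is a square, hence non-negative for every $n$, which is what allows it to serve as a majorant at all.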

The weights $\nu$ are much more flexible than $\Lambda$, since it is possible to evaluate such sums 
as 

$$\mathbb{E}_{n \le N} \nu(n) \nu(n + 2)$$ 

asymptotically by variants of the computations leading to (1.3). As in those computations 
the key feature is that by choosing the parameter $\gamma$ to be small enough the number 
of terms which result from expanding out the sums over $d$ is small enough that error 
terms do not dominate. 



In fact (after an application of the W-trick discussed above and with an appropriate 
choice of $\gamma = \gamma(t, d)$) the weights are sufficiently flexible that they may be shown to 
satisfy two technical conditions called the linear forms and correlation conditions. These 
conditions were introduced in [21, §3], and the variants of these conditions appropriate 
for a discussion of Dickson's Conjecture are given in [24, §6]. As a result of this the 
weight $\nu$ qualifies to be called pseudorandom. 

As the reader may have guessed from the above discussion, it is possible to prove a 
version of Proposition 3.0.7 in which the condition that the $f_i$ take values in $[-1, 1]$ is 
relaxed to a condition $|f_i(x)| \le \nu(x)$, where $\nu$ is a pseudorandom weight function. The 
first step is to establish 'generalized von Neumann'-type results in which the functions 
$f_i$ are bounded by $\nu$, rather than just by 1. Specifically, one deduces from an assumption 
(3.1) that each of the Gowers norms $\|f_i\|_{U^2}$ is somewhat large. This is once again 
accomplished by several applications of the Cauchy-Schwarz inequality, but the presence 
of the weight $\nu$ makes the details even more complicated. For a fully worked-out 
example, see [21, §5]. Once this is done one must establish an inverse theorem for the 
Gowers $U^2$-norm for functions $f$ with $|f(x)| \le \nu(x)$. As we remarked, some new ideas 
are required here since the harmonic analysis argument leading up to (3.4) breaks down 
if $f$ is not bounded by one. 

In fact this inverse theorem is deduced from the version with $|f(x)| \le 1$ by means of 
the following decomposition result, which is [24, Proposition 10.3]. 

Lemma 3.0.8 (Decomposition). Suppose that $\nu : \mathbb{Z}/N\mathbb{Z} \to \mathbb{R}_{\ge 0}$ is a pseudorandom 
measure, and that $f : \mathbb{Z}/N\mathbb{Z} \to \mathbb{R}$ is a function with $|f(x)| \le \nu(x)$ for all $x$. Then we 
may decompose 

$$f = f_1 + f_2$$ 

where $|f_1(x)| \le 1$ for all $x$ and $\|f_2\|_{U^2} = o(1)$ as $N \to \infty$. 

We shall say almost nothing about the proof of this result, but it is also one of the key 
ingredients in [21]. For a discussion, see [13, §6]. For a broader discussion of the 
'energy-increment' strategy used in the proof, which appears in many different contexts 
in additive combinatorics, see [15]. 

Suppose that $|f(x)| \le \nu(x)$ and that $\|f\|_{U^2} \ge \delta$. Applying Lemma 3.0.8 and the triangle 
inequality for the $U^2$-norm, we see that $\|f_1\|_{U^2} \ge \delta/2$ once $N$ is sufficiently large. By the 
inverse theorem for the $U^2$-norm of bounded functions, there is some $r \in \mathbb{Z}/N\mathbb{Z}$ such that 

$$|\mathbb{E}_{n \in \mathbb{Z}/N\mathbb{Z}} f_1(n) e(-rn/N)| \ge \delta^2/4.$$ 

However by the converse of the inverse theorem and the fact that $\|f_2\|_{U^2} = o(1)$ we see 
that 

$$|\mathbb{E}_{n \in \mathbb{Z}/N\mathbb{Z}} f_2(n) e(-rn/N)| = o(1).$$ 

(Note that the proof of this converse result did not require $f_2$ to be bounded by 1.) By 
the triangle inequality it follows that 

$$|\mathbb{E}_{n \in \mathbb{Z}/N\mathbb{Z}} f(n) e(-rn/N)| \ge \delta^2/4 - o(1) \ge \delta^2/8,$$ 

which concludes the 'transference' of the inverse theorem for the $U^2$-norm from functions 
bounded by 1 to functions bounded by $\nu$. 
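
The two facts about bounded functions used in this transference step, namely the inverse theorem for the $U^2$-norm and its converse, both follow from the identity expressing the $U^2$-norm as the $\ell^4$-norm of the Fourier transform. The following brute-force Python check is a sketch for illustration only; the modulus N = 24, the random test function and the helper names are our own choices.

```python
import cmath
import random


def e(theta):
    """The additive character e(theta) = exp(2 * pi * i * theta)."""
    return cmath.exp(2j * cmath.pi * theta)


def fourier_coefficients(f):
    """hat f(r) = E_n f(n) e(-rn/N) for each r in Z/NZ."""
    N = len(f)
    return [sum(f[n] * e(-r * n / N) for n in range(N)) / N for r in range(N)]


def u2_norm(f):
    """||f||_{U^2} from the definition: the fourth root of
    E_{x,h1,h2} f(x) conj(f(x+h1)) conj(f(x+h2)) f(x+h1+h2)."""
    N = len(f)
    total = 0
    for x in range(N):
        for h1 in range(N):
            for h2 in range(N):
                total += (
                    f[x]
                    * f[(x + h1) % N].conjugate()
                    * f[(x + h2) % N].conjugate()
                    * f[(x + h1 + h2) % N]
                )
    return abs(total / N ** 3) ** 0.25


random.seed(0)
N = 24  # illustrative modulus
f = [random.uniform(-1, 1) for _ in range(N)]  # a function bounded by 1

u2 = u2_norm(f)
coefficients = [abs(c) for c in fourier_coefficients(f)]
l4 = sum(c ** 4 for c in coefficients) ** 0.25
largest = max(coefficients)

print("||f||_{U^2}           =", u2)
print("l^4 norm of hat f     =", l4)
print("largest |hat f(r)|    =", largest)
print("inverse theorem holds :", largest >= u2 ** 2 - 1e-12)
print("converse holds        :", largest <= u2 + 1e-12)
```

The inverse direction uses the bound $|f| \le 1$ (through Parseval), whereas the converse does not, which is the point of the parenthetical remark above.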

Let us take stock of our position. We have indicated a proof of Proposition 3.0.7 when 
the functions $f_i$ are bounded by a pseudorandom weight $\nu$, a fairly robust realisation 
of the principle that harmonic analysis governs the behaviour of systems of complexity 
one. We have split the von Mangoldt function $\Lambda$ into functions $\Lambda_{b,W}$, and rewritten the 
sum which interests us, namely (3.5), as a sum of expressions of the form (3.6). To 
evaluate these we effect the further splitting $\Lambda_{b,W} = 1 + (\Lambda_{b,W} - 1)$, and hope to show 
that any sum 

$$\sum_n f_1(\psi_1(n)) \cdots f_t(\psi_t(n))$$ 

in which at least one $f_i$ equals $\Lambda_{b_i,W} - 1$ is negligible. All of these functions are essentially 
dominated by some pseudorandom weight $\nu$ of the type considered by Goldston and 
Yildirim, and so by our robust version of Proposition 3.0.7 it suffices to rule out the 
possibility that $\Lambda_{b,W} - 1$ correlates with a linear phase function; that is to say, we must 
establish that 

$$|\mathbb{E}_{n \le N} (\Lambda_{b,W}(n) - 1) e(-rn/N)| = o(1).$$ (3.7) 

This estimate may be established in a fairly classical fashion using the ideas of Hardy, 
Littlewood and Vinogradov. In [24] we proceed by first effecting some further decompositions, 
in keeping with our general philosophy that problems should be 'transferred' 
to a situation where functions are bounded by 1. We skip the details (which may be 
found in [24, §12]) and merely state that (3.7) can be deduced from the estimate 

$$|\mathbb{E}_{n \le N} \mu(n) e(-rn/N)| \ll_A \log^{-A} N,$$ (3.8) 

for some suitably large $A$. That such an estimate holds (with any $A$) is a well-known 
result of Davenport [3]. To prove it one may use the method of Type I/Type II sums 
discussed in §2. In fact, Proposition 2.3.1 is true with the von Mangoldt function $\Lambda$ replaced 
by the Möbius function $\mu$. The proof is almost the same, relying on a decomposition of 
$\mu$ which is very similar to Vaughan's decomposition of $\Lambda$. 
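
One can get a feeling for the estimate (3.8) by computing the left-hand side for every frequency $r$ at a modest value of $N$. The sketch below is an experiment rather than evidence for the logarithmic savings, which only emerge asymptotically; the value N = 512 and the function names are our own illustrative choices.

```python
import cmath


def mobius_table(limit):
    """Return a list mu with mu[n] equal to the Moebius function for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_composite = bytearray(limit + 1)
    for p in range(2, limit + 1):
        if not is_composite[p]:
            # Each prime factor flips the sign ...
            for m in range(p, limit + 1, p):
                if m > p:
                    is_composite[m] = 1
                mu[m] = -mu[m]
            # ... and a repeated prime factor kills the value.
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0
    return mu


N = 512  # illustrative choice
mu = mobius_table(N)

# Precompute e(-k/N) so that e(-rn/N) is a table lookup at index (r * n) mod N.
roots = [cmath.exp(-2j * cmath.pi * k / N) for k in range(N)]


def correlation(r):
    """|E_{n <= N} mu(n) e(-rn/N)| for a frequency r in Z/NZ."""
    return abs(sum(mu[n] * roots[(r * n) % N] for n in range(1, N + 1))) / N


largest = max(correlation(r) for r in range(N))
print("max over r of |E mu(n) e(-rn/N)| =", largest)
```

The trivial bound for each correlation is 1; the computed maximum should be far smaller, for major-arc and minor-arc frequencies alike.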

The remarks following the statement of Proposition 2.3.1 are particularly relevant here. 
We can hope that the method of Type I/II sums will be effective in bounding (3.8) 
unless $r/N$ is approximately $a/q$, for some small $q$ (that is, the method ought to be 
successful in the 'minor arc' case). Luckily in the 'major arc' case one may approximate 
$e(-rn/N)$ by the sum of a few Dirichlet characters to modulus $q$. The resulting sums 
$\sum_{n \le N} \mu(n) \chi(n)$ may then be estimated using standard techniques of analytic number 
theory together with information about the non-existence of zeros near $\Re s = 1$ of the 
L-functions $L(s, \chi)$; for details see [32, Prop 5.29]. 

This concludes our discussion of a proof of Dickson's Conjecture for systems of com- 
plexity one. As we have remarked, this result could also be obtained by a more classical 
application of the circle method. However it turns out that large parts of our discussion 
adapt very painlessly to systems of complexity $s > 1$, whereas this does not seem to be 
the case for the classical techniques. 

One has, for example, the following bound of generalized von Neumann type: 

$$|\mathbb{E}_{n \in (\mathbb{Z}/N\mathbb{Z})^d} f_1(\psi_1(n)) \cdots f_t(\psi_t(n))| \le \inf_{i \in [t]} \|f_i\|_{U^{s+1}}$$ (3.9) 

for systems $\psi_1, \dots, \psi_t$ of complexity $s$, where $\|f\|_{U^k}$ is the Gowers $U^k$-norm of $f$ and is 
defined by 

$$\|f\|_{U^k}^{2^k} := \mathbb{E}_{x, h_1, \dots, h_k} \prod_{\omega_1, \dots, \omega_k \in \{0,1\}} f(x + \omega_1 h_1 + \cdots + \omega_k h_k)$$ 

(with complex conjugates being taken of the terms involving an odd number of the $h_i$). This is 
true even if the functions $f_i$ are only bounded by a pseudorandom weight $\nu$. A statement 
very close to (3.9) is proved in [24, Appendix C]. 

The decomposition result, Lemma 3.0.8, also adapts in a fairly painless manner. 

The first really serious issue that we encounter is in finding a generalization of the inverse 
theorem for the $U^2$-norm. If $f : \mathbb{Z}/N\mathbb{Z} \to [-1, 1]$ is a function such that $\|f\|_{U^3} \ge \delta$, 
what can be said? The most immediate difficulty with attacking this statement is the 
lack of a suitable formula generalizing the relation $\|f\|_{U^2} = \|\hat{f}\|_4$. A more decisive 
problem is revealed by consideration of the function $f(n) = e(n^2/N)$. One may check 
that $\|f\|_{U^3} = 1$ (this is essentially a manifestation of the fact that the third derivative 
of a quadratic is zero). With somewhat more effort one may check that this $f$ does 
not have substantial correlation with a linear exponential $e(rn/N)$. Thus an inverse 
theorem for the $U^3$-norm must encode some kind of 'higher harmonic analysis' which 
takes account of these quadratic phases as well as just linear ones. The situation is 
further complicated by the existence of further examples, such as $f(n) = e(n[n\sqrt{2}]/N)$, 
for which $\|f\|_{U^3}$ is large, but for which $f$ does not even exhibit genuine quadratic 
behaviour. A full discussion of examples related to these may be found in [1]. 
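
The behaviour of the quadratic phase $f(n) = e(n^2/N)$ can be confirmed by brute force for a small modulus. The following sketch evaluates the Gowers norms straight from the definition (the $2^k$-th root of an average over $x, h_1, \dots, h_k$); the choice N = 11 and the function names are ours, and the cost grows like $N^{k+1}$, so this is only feasible for tiny parameters.

```python
import cmath
from itertools import product


def e(theta):
    """The additive character e(theta) = exp(2 * pi * i * theta)."""
    return cmath.exp(2j * cmath.pi * theta)


def gowers_norm(f, k):
    """Brute-force U^k norm of f : Z/NZ -> C, directly from the definition."""
    N = len(f)
    total = 0
    for x in range(N):
        for hs in product(range(N), repeat=k):
            term = 1
            for omega in product((0, 1), repeat=k):
                value = f[(x + sum(w * h for w, h in zip(omega, hs))) % N]
                # Conjugate the terms involving an odd number of the h_i.
                if sum(omega) % 2 == 1:
                    value = value.conjugate()
                term *= value
            total += term
    return abs(total / N ** (k + 1)) ** (1 / 2 ** k)


N = 11  # a small odd modulus, chosen for illustration
quadratic_phase = [e(n * n / N) for n in range(N)]

u3 = gowers_norm(quadratic_phase, 3)
u2 = gowers_norm(quadratic_phase, 2)
largest_linear_correlation = max(
    abs(sum(quadratic_phase[n] * e(-r * n / N) for n in range(N))) / N
    for r in range(N)
)

print("U^3 norm:", u3)
print("U^2 norm:", u2, "compared with N^(-1/4) =", N ** -0.25)
print("largest linear correlation:", largest_linear_correlation,
      "compared with N^(-1/2) =", N ** -0.5)
```

For odd $N$ the $U^3$-norm is exactly 1, while the $U^2$-norm is $N^{-1/4}$ and every linear correlation has modulus $N^{-1/2}$ (a Gauss sum), so both of the latter tend to zero as $N$ grows.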



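It turns out that these two examples, $f(n) = e(n^2/N)$ and $f(n) = e(n[n\sqrt{2}]/N)$, can 
both be interpreted as objects living on a 2-step nilmanifold, that is to say a quotient 
$G/\Gamma$ where $G$ is a 2-step nilpotent Lie group and $\Gamma$ is a discrete and cocompact subgroup. 
The archetypal example is the Heisenberg example in which 

$$G = \begin{pmatrix} 1 & \mathbb{R} & \mathbb{R} \\ 0 & 1 & \mathbb{R} \\ 0 & 0 & 1 \end{pmatrix} \quad \text{and} \quad \Gamma = \begin{pmatrix} 1 & \mathbb{Z} & \mathbb{Z} \\ 0 & 1 & \mathbb{Z} \\ 0 & 0 & 1 \end{pmatrix}.$$ 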



Quadratic polynomials appear quite naturally in such a group $G$, for instance in the 
computation 

$$\begin{pmatrix} 1 & \alpha & \beta \\ 0 & 1 & \gamma \\ 0 & 0 & 1 \end{pmatrix}^n = \begin{pmatrix} 1 & n\alpha & n\beta + \frac{1}{2} n(n-1) \alpha\gamma \\ 0 & 1 & n\gamma \\ 0 & 0 & 1 \end{pmatrix}.$$ 

Taking such a sequence of matrices and postmultiplying by elements of $\Gamma$ so that all 
of the entries lie in $[-1/2, 1/2]$ (that is, reducing to a fundamental domain for 
the action of $\Gamma$ on $G$) one soon sees the appearance of the more complicated 'bracket 
quadratics' $\gamma n[\alpha n]$ too. A fuller discussion, with motivating remarks, may be found in 

m- 
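
The formula for the $n$th power of a Heisenberg matrix, with its quadratic term $\frac{1}{2} n(n-1)\alpha\gamma$ in the top right corner, is easily confirmed in exact rational arithmetic. The following sketch is illustrative; the particular values of $\alpha$, $\beta$, $\gamma$ and the helper names are arbitrary choices of ours.

```python
from fractions import Fraction


def heisenberg(a, b, c):
    """The upper unitriangular matrix with entries a, b (top row) and c."""
    return ((1, a, b), (0, 1, c), (0, 0, 1))


def matmul(X, Y):
    """Product of two 3 x 3 matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def power(X, n):
    """X^n by repeated multiplication (n >= 0)."""
    result = heisenberg(0, 0, 0)  # the identity matrix
    for _ in range(n):
        result = matmul(result, X)
    return result


def closed_form(alpha, beta, gamma, n):
    """The claimed formula, with n*beta + n(n-1)/2 * alpha*gamma in the corner."""
    return heisenberg(
        n * alpha,
        n * beta + Fraction(n * (n - 1), 2) * alpha * gamma,
        n * gamma,
    )


alpha, beta, gamma = Fraction(1, 3), Fraction(2, 7), Fraction(5, 11)
g = heisenberg(alpha, beta, gamma)

agrees = all(power(g, n) == closed_form(alpha, beta, gamma, n) for n in range(20))
print("closed form agrees with repeated multiplication:", agrees)
```

Using Fraction rather than floating point makes the comparison exact, so agreement here is a genuine check of the identity for these parameters and not an artefact of rounding.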

What is more, correlation with an example arising in this setting is the only way in 
which a function $f$ can have large $U^3$-norm. This is the inverse theorem for the $U^3$-norm, 
proved in [22] building on ideas of Gowers p^. It is conjectured that an analogous 
result holds for the $U^{s+1}$-norm in general, a conjecture we refer to as the Gowers Inverse 
conjecture GI(s). 

Conjecture 3.0.9 (Gowers inverse conjecture GI(s)). Suppose that $f : \mathbb{Z}/N\mathbb{Z} \to [-1, 1]$ 
is a function and that $\|f\|_{U^{s+1}} \ge \delta$. Then there is an $s$-step nilmanifold $G/\Gamma$, a Lipschitz 
function $F : G/\Gamma \to [-1, 1]$ and elements $g \in G$, $x \in G/\Gamma$ such that 

$$|\mathbb{E}_{n \le N} f(n) F(g^n x\Gamma)| \gg_\delta 1.$$ 

The dimension and complexity of $G/\Gamma$ and the Lipschitz constant of $F$ are all $O_{s,\delta}(1)$. 



The function $n \mapsto F(g^n x\Gamma)$ is called an $s$-step nilsequence. To state this conjecture 
properly one must of course define the notion of 'complexity', and also assign a metric 
to $G/\Gamma$ so that the notion of Lipschitz constant may be properly formalised. A version 
of the conjecture was first formulated in [24, §8]. There, a metric on $G/\Gamma$ was assigned 
quite arbitrarily. With the benefit of hindsight it is probably better to proceed as in our 
more recent paper [25, §2], in which the notions of 'complexity' and 'metric' are both 
developed from the notion of a Mal'cev basis for $G/\Gamma$. 

The precise statements are not important in order to understand the philosophy which 
lies behind Conjecture 3.0.9 and its interplay with the generalized von Neumann theorem 
(3.9): it seems as though the correct 'harmonics' with which to study systems 
$\{\psi_1, \dots, \psi_t\}$ of complexity $s$ are the $s$-step nilsequences. 

Conjecture 3.0.9 is known when $s = 1$, and we proved it earlier. Note that the linear 
exponentials $n \mapsto e(-rn/N)$ may easily be interpreted as 1-step nilsequences living on 
the nilmanifold $\mathbb{R}/\mathbb{Z}$. As we stated, it is also known when $s = 2$, this being the main 
result of [22]. Tao and I hope to report progress on the general case $s \ge 3$ in the near 
future. 

Assuming Conjecture 3.0.9 one may start to work through the proof of Dickson's Conjecture 
in the complexity 1 case and attempt to generalise it to the complexity $s$ situation. 
Replacing occurrences of linear exponentials $e(-rn/N)$ by $s$-step nilsequences 
$F(g^n x\Gamma)$, the argument runs with remarkably few changes. One fairly significant extra 
difficulty occurs in the proof of Lemma 3.0.8 where a 'converse' to the inverse conjecture 
is required. Namely, one needs to know that if 

$$|\mathbb{E}_{n \le N} f(n) F(g^n x\Gamma)| \ge \delta$$ 

for some $s$-step nilsequence $F(g^n x\Gamma)$ then 

$$\|f\|_{U^{s+1}} \gg_\delta 1,$$ 

where the implied constant may also depend on the complexity of $G/\Gamma$ and on the 
Lipschitz constant of $F$. Such a result is already present in [22, §14], and a somewhat 
more conceptual proof is given in [23, Appendix E]. Both of these appendices were based 
on extensive conversations with members of the ergodic theory community. 

By far the most serious obstacle is the last one, where we reduce to establishing a 
generalization of (3.8) for nilsequences. In other words we seek the bound 

$$|\mathbb{E}_{n \le N} \mu(n) F(g^n x\Gamma)| \ll_A \log^{-A} N$$ 

for all $A > 0$, where $F(g^n x\Gamma)$ is an $s$-step nilsequence arising from some $s$-step nilmanifold 
$G/\Gamma$, and the implied constant may also depend on the complexity of $G/\Gamma$ and 
the Lipschitz constant of $F$. This bound is referred to as the Möbius and Nilsequences 
Conjecture MN(s). As we remarked, the conjecture MN(1) was essentially established 
by Davenport. The case MN(2) was obtained in the paper [23]. The general case MN(s) 
has recently been resolved by the authors and will appear in the short paper [26]; the 
key technical ingredient in that proof is the main result of [25], which may be thought 
of as a kind of generalization of the major-minor arc decomposition to nilsequences of 
arbitrary step. The method of Type I/II sums is crucial once more. 

The reader wishing to study any of this work might find the following table helpful. Let 
me once again emphasise that the purpose of this article has been to guide potential 
readers through the papers [21], [22], [23], [24], [25] and [26]; we do not intend to suggest 
that there is no other work going on in the subject! 

[21]. The primes contain arbitrarily long arithmetic progressions; independent of the 
other papers except that it contains the proof of the Decomposition Lemma 3.0.8. 

[22]. An inverse theorem for the Gowers $U^3$-norm, with applications; proof of the GI(2) 
conjecture. 

[23]. Quadratic Uniformity of the Möbius function; proof of the MN(2) conjecture, to 
be largely superseded by [26] but, unlike that paper, can be understood without [25]. 

[24]. Linear Equations in primes; proof that the GI(s) and MN(s) conjectures together 
imply Dickson's conjecture for systems $\{\psi_1, \dots, \psi_t\}$ of complexity $s$. The discussion in 
this article has been largely an exposition of some of the ideas in this paper. 

[25]. The Quantitative Behaviour of Polynomial Orbits on Nilmanifolds; key technical 
ingredient for studying nilsequences. 

[26]. The Möbius and Nilsequences Conjectures; proof of the MN(s) conjecture for all $s$, 
heavily reliant on [25]. 